Python Forum

Full Version: How to use data from an API, to create an alert?
Hello,

First off, I'd like to say I'm very new, and this is my first attempt at using Python to create a useful program for myself, so go easy on me please.

What I'm trying to create is a program that will search 4chan's /biz/ catalog through its API for some keywords and email me the thread link when my keywords are being talked about. To start off, I've requested the catalog from the API and assigned it to a variable using:

import requests

response = requests.get('https://a.4cdn.org/biz/catalog.json')
catalog = response.json()
When I print this it returns a lovely big list with other nested dictionaries and lists, as this image illustrates:

[Image: Dict.jpg]

So I have an 11-item list, one item per page. One level deeper, each item is a dict with a 'page' key holding the page number and a 'threads' key holding a list of 20 dicts, one per thread, with all of the keys and values for that thread.

What I'm trying to do is iterate through each thread on each page and search the values of 'sub' and 'com' for my keywords. If they show up, I want to return the value of the 'no' key for that thread. Then I can append that number to the end of a URL and email it to myself every time a new post pops up.

I understand I'm biting off a bit more than I can chew here with my skillset, and I'm not asking for someone to write the code for me. Rather, I'm hoping someone knows of resources that could help me better understand how to search through complicated lists of nested lists and dicts, as a lot of the tutorials I've found online either aren't working with lists this complicated or are trying to do something completely different.

Anyways, thanks for reading,
Pop.
UPDATE:

found a good video that explained lists and dictionaries quite well here

this is where I'm at so far.

import requests
import json

def search(threads, lookup1):
    number = ['no']
    for k, v in threads.items():
        try:
            if lookup1 in v:
                return number

        except:
            continue

response = requests.get('https://a.4cdn.org/biz/catalog.json')
catalog = json.loads(response.text)

for page in catalog:
    thread = page['threads']
    for threads in thread:
        search(threads, 'DOGE')
The function I've added is mostly guesswork at this point, but I've managed to loop through the list of pages and I can now print each individual "thread" dict. I'm using 'DOGE' as an example keyword, as mine are hardly ever mentioned, but still no luck actually returning any data.
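One possible reason nothing comes back: `number = ['no']` is a list holding the literal string 'no' rather than the thread's number, and whatever `search` returns is never stored or printed by the loop. Below is a minimal sketch of one way the function might be adjusted. It runs against a small hand-made sample instead of the live API; the sample data and the names `sample_catalog` and `matches` are made up for illustration, and the lower-casing is an added assumption so that 'DOGE' also matches 'doge'.

```python
def search(thread, keyword):
    """Return the thread number if keyword appears in 'sub' or 'com', else None."""
    keyword = keyword.lower()
    for key in ('sub', 'com'):
        # .get() returns '' when the key is missing, so no try/except is needed
        text = thread.get(key, '')
        if keyword in text.lower():
            return thread['no']
    return None


# Hand-made sample shaped like the catalog described above (not real data)
sample_catalog = [
    {'page': 1, 'threads': [
        {'no': 101, 'sub': 'DOGE general', 'com': 'to the moon'},
        {'no': 102, 'com': 'nothing relevant here'},
    ]},
    {'page': 2, 'threads': [
        {'no': 201, 'sub': 'Stocks', 'com': 'is doge dead?'},
    ]},
]

# Keep what search() returns instead of discarding it
matches = []
for page in sample_catalog:
    for thread in page['threads']:
        number = search(thread, 'DOGE')
        if number is not None:
            matches.append(number)

print(matches)  # [101, 201]
```

With the real data, `sample_catalog` would simply be replaced by the `catalog` variable from the request.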
Do I understand correctly that you want to find something?

import requests
import json

response = requests.get('https://a.4cdn.org/biz/catalog.json')
catalog = json.loads(response.text)

# Show which keys the first thread on the first page has
print(catalog[0]['threads'][0].keys())

the_name = "Anonymous"

# Walk every page and every thread on it, not just the first ten
for x, page in enumerate(catalog):
    for a, thread in enumerate(page['threads']):
        # .get() returns None instead of raising when 'name' is missing
        if thread.get('name') == the_name:
            print("found", the_name, "at", a, x)
(Apr-20-2021, 09:20 PM)Axel_Erfurt Wrote: [ -> ]Do I understand correctly that you want to find something?


Hi Axel, correct, I'm trying to search parts of each thread for something. Each thread is a dict containing a 'sub' and a 'com' key, I'm trying to search the value of these for some keywords (I'm using the keyword 'doge' to test). If a keyword is found I'd like to return the value of the 'no' key for that thread.

Thanks!
The first post was correct; there is no need to mix in the json library.
import requests
 
response = requests.get('https://a.4cdn.org/biz/catalog.json')
catalog = response.json() # Correct
Some tips, plus a couple of libraries (jmespath and nested-lookup) that can help search the kind of nested dictionary that JSON can return.
import requests
from nested_lookup import nested_lookup
import jmespath

response = requests.get('https://a.4cdn.org/biz/catalog.json')
catalog = response.json()

# Ordinary way
find_sub = catalog[0]['threads'][0]['sub']
print(find_sub)

# Find all <md5> using nested_lookup
find_md5 = nested_lookup('md5', catalog)[:5]
print(find_md5)

# Find thread <no> using jmespath
thread_no = jmespath.search('threads[*].no', catalog[0])
print(thread_no)
Output:
NO BEGGING
['pJvXhdg2z1jzhaXDpXJCSA==', 'FnhPHN6dqDf2BRAe/KlekQ==', 'otmlJIe2Z8Ie1tOznSJeTw==', 'bYhv5BMl3FDUl6Ij35KjFA==', 'aH0WKPS4g13LSXvNvXP8ZQ==']
[4884770, 21374000, 33546252, 33545084, 33529907, 33548153, 33544096, 33534099, 33546968, 33546648, 33547703, 33545399, 33533316, 33544886, 33547053, 33548145, 33547368, 33547858, 33545941, 33546379]
(Apr-20-2021, 10:27 PM)snippsat Wrote: [ -> ]The first post was correct no need to mix in json library.

So, are you saying the correct method would be to search for all of the 'sub' and 'com' values first, before trying to search for a keyword within them? Thanks for these libraries, these look very helpful.
(Apr-20-2021, 11:01 PM)PopFendi Wrote: [ -> ]So, are you saying the correct method would be to search for all of the 'sub' and 'com' values first, before trying to search for a keyword within them?
No, that was just a demo of how these libraries can help with a large nested return like this.
I have not looked more closely at exactly what you want (describe it better). 'doge', as you talk about it, looks like plain text rather than a key in the nested dictionary.
(Apr-20-2021, 11:36 PM)snippsat Wrote: [ -> ]

Sorry, let me clarify.

'Doge' is the string I am trying to search for within the values of 'com' and 'sub'. If the 'Doge' string is found, I want it to return the value of 'no' (i.e. the thread number). Perhaps my latest attempt might help to illustrate what I mean.

import requests
import json

response = requests.get('https://a.4cdn.org/biz/catalog.json')
catalog = json.loads(response.text)

for page in catalog:
    thread = page['threads']
    for threads in thread:
        no = threads["no"]
        word = "doge"
        try:
            sub = threads["sub"]
            for value in sub:
                if word in value:
                    print(no)
        except:
            pass

        try:
            com = threads["com"]
            for value in com:
                if word in value:
                    print(no)
        except:
            continue
I've added the try / except because some threads do not contain either a 'com' or a 'sub' key, and I was getting a traceback.

Hope that makes sense,
Thank you!
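A note on why the attempt above may print nothing: `for value in sub` walks the string one character at a time, so `word in value` checks a four-letter word against a single character. The small sketch below illustrates that, and also shows `dict.get` as one alternative to try/except for missing keys. The sample strings and the `thread` dict are made up for illustration.

```python
sub = 'doge general'

# Iterating over a string yields single characters, so a four-letter
# word can never be "in" any one of them
char_hits = [value for value in sub if 'doge' in value]

# Testing the whole string at once is what finds the word
whole_hit = 'doge' in sub

print(char_hits, whole_hit)  # [] True

# A thread without a 'sub' key: .get() falls back to '' instead of raising
thread = {'no': 102, 'com': 'nothing relevant here'}
missing_sub = thread.get('sub', '')
print('doge' in missing_sub)  # False
```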
I worked it out!

import requests
import json

response = requests.get('https://a.4cdn.org/biz/catalog.json')
catalog = json.loads(response.text)

for page in catalog:
    threads = page['threads']
    for thread in threads:
        no = thread["no"]
        # Some threads have no 'sub' key, so only a KeyError is ignored
        try:
            if 'doge' in thread["sub"]:
                print(no)
        except KeyError:
            pass

        # Same check for 'com', which can also be missing
        try:
            if 'doge' in thread["com"]:
                print(no)
        except KeyError:
            continue
This prints out the numbers I want. Now I just need to work out the second part, which is emailing myself the results as a link.
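For the emailing part, here is a rough sketch using only the standard library (`email.message` and `smtplib`). Everything specific in it is an assumption rather than something from this thread: the base URL in `THREAD_URL`, the helper names `build_message` and `send_message`, the addresses, and the SMTP host, port and credentials, which depend entirely on the mail provider.

```python
import smtplib
from email.message import EmailMessage

# Assumption: thread pages live under this base URL; adjust if it differs
THREAD_URL = 'https://boards.4chan.org/biz/thread/{}'


def build_message(thread_numbers, sender, recipient):
    """Build an email whose body lists one link per matching thread."""
    links = [THREAD_URL.format(no) for no in thread_numbers]
    msg = EmailMessage()
    msg['Subject'] = f'{len(links)} matching thread(s) on /biz/'
    msg['From'] = sender
    msg['To'] = recipient
    msg.set_content('\n'.join(links))
    return msg


def send_message(msg, host, port, user, password):
    """Send over SMTP with SSL; host, port and credentials are placeholders."""
    with smtplib.SMTP_SSL(host, port) as server:
        server.login(user, password)
        server.send_message(msg)


# Example with made-up thread numbers and addresses; nothing is sent here
message = build_message([101, 201], 'me@example.com', 'me@example.com')
print(message.get_content())
```

Collecting the numbers into a set before calling `build_message` would avoid a duplicate link when a thread matches in both 'sub' and 'com'.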