Python Forum
Parsing large JSON
#1
Hi!

I am very new to Python but love playing with it. What I want to accomplish is to get the content of the following JSON URL: https://raw.githubusercontent.com/superm...rkets.json

Then I want to parse that content for the different supermarkets named in the first n value (for example AH, Aldi, Coop, etc.), go through all the products under the d value, and print all the links.

I already have code that extracts the content from the link in the JSON, so I only need the supermarket name and the product link.

Does anyone have any advice?

Greetings,
Jos
#2
A JSON file defines a data structure. You don't need to "parse" it yourself; json.load() reads the file and reproduces the structure as Python objects. Loading your JSON file creates a list of dictionaries, and each of those dictionaries in turn contains a list of dictionaries.
[
    {"n":"ah","d":[
        {"n":"'t IJ Kadoosje bierpakket","l":"wi394045/t-ij-kadoosje-bierpakket","p":17.99,"s":"6 x 0,33 l"},
        {"n":"&c","l":"wi410827/en-c","p":7.45,"s":"per stuk"},
        {"n":"&Then Cabernet sauvignon alcoholvrij","l":"wi549461/en-then-cabernet-sauvignon-alcoholvrij","p":6.99,"s":"0,75 l"},
        {"n":"&Then Chardonnay alcoholvrij","l":"wi549460/en-then-chardonnay-alcoholvrij","p":6.99,"s":"0,75 l"},
        {"n":"100 Watt Orchestra of angels","l":"wi437855/100-watt-orchestra-of-angels","p":3.29,"s":"0,33 l"},
        {"n":"100% Coconut grove","l":"wi415202/100-coconut-grove","p":1.99,"s":"1 l"},
        {"n":"1000 Stories Bourbon barrel aged Zinfandel","l":"wi473073/1000-stories-bourbon-barrel-aged-zinfandel","p":16.59,"s":"0,75 l"},
        {"n":"19 Crimes Chardonnay","l":"wi465846/19-crimes-chardonnay","p":9.49,"s":"0,75 l"},
        {"n":"19 Crimes Red blend","l":"wi465836/19-crimes-red-blend","p":9.49,"s":"0,75 l"},
        {"n":"19 Crimes Sauvignon blanc","l":"wi503579/19-crimes-sauvignon-blanc","p":9.49,"s":"0,75 l"},
...
The keys ("n", "d", "l", "p", "s") are not very descriptive.
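To make the nesting concrete, here is a minimal sketch that loads a trimmed copy of the sample above from a string and walks one level at a time. It uses json.loads() (the string counterpart of json.load()) only so the example runs without the file; the variable names are just for illustration.

```python
import json

# Two products copied from the sample above, trimmed to keep the example short.
sample = '''
[
    {"n": "ah", "d": [
        {"n": "&c", "l": "wi410827/en-c", "p": 7.45, "s": "per stuk"},
        {"n": "100% Coconut grove", "l": "wi415202/100-coconut-grove", "p": 1.99, "s": "1 l"}
    ]}
]
'''

# json.loads() reads from a string; json.load() does the same from an open file.
data = json.loads(sample)

first_market = data[0]                  # outer list -> dict for one supermarket
first_product = first_market["d"][0]    # "d" -> list of product dicts

# Show the type at each level of nesting: list, dict, list, dict.
print(type(data).__name__, type(first_market).__name__,
      type(first_market["d"]).__name__, type(first_product).__name__)

# Name of the supermarket, then the link and price of its first product.
print(first_market["n"], first_product["l"], first_product["p"])
```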
#3
Hi deanhystad,

Thanks for the explanation! The JSON file is not mine but is free to use. I believe the author shortened the keys to reduce the file size. Could you or someone else write some example code that prints the supermarket name (n key) and the product URL (l key)?

Thanks!
#4
Is there a supermarket name? Is it "ah"? I mostly see inventory "d". You can read about the json module here:

https://docs.python.org/3/library/json.html
#5
Hi,

Yes, the names of the supermarkets are in the first n key. After that, the d key holds all the products that supermarket has. The supermarkets are:
  • ah
  • aldi
  • coop
  • dekamarkt
  • dirk
  • hoogvliet
  • janlinders
  • jumbo
  • picnic
  • plus
  • spar
  • vomar
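A list like this does not have to be typed by hand; it can be read from the loaded data. The sketch below assumes the structure described earlier in the thread (a list of dicts with "n" and "d" keys) and uses a small stand-in string with only three supermarkets instead of the real file.

```python
import json

# Stand-in for the real file: same shape, product lists cut down for brevity.
sample = '''[
    {"n": "ah", "d": [{"n": "&c", "l": "wi410827/en-c", "p": 7.45, "s": "per stuk"}]},
    {"n": "aldi", "d": []},
    {"n": "coop", "d": []}
]'''
data = json.loads(sample)

# The top-level "n" of every entry is the supermarket name.
names = [market["n"] for market in data]

# Number of products per supermarket, keyed by name.
counts = {market["n"]: len(market["d"]) for market in data}

print(names)
print(counts)
```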
#6
The structure of the JSON file is not very good, so you have to turn it into something that makes sense.
Use e.g. JSON Editor Online to see the structure better.
Here is an example of how you could start to make sense of the data and search it.
import json

with open('supermarkets.json') as fp:
    data = json.load(fp)

#print(data)

supermarkets = 'AH'
for market in range(12):
    if supermarkets in data[market].values():
        print(f'{supermarkets} is a market')
        # first 5 relative links (the 'l' key)
        for item in range(5):
            print(data[market]['d'][item]['l'])
Output:
AH is a market
wi394045/t-ij-kadoosje-bierpakket
wi410827/en-c
wi549461/en-then-cabernet-sauvignon-alcoholvrij
wi549460/en-then-chardonnay-alcoholvrij
wi437855/100-watt-orchestra-of-angels
If you change it to supermarkets = 'Coop', you get the first 5 from Coop.
Coop just uses numbers for its relative links.
Output:
Coop is a market
8714319929120
8711757043012
8720182162120
8720182161970
9001442225977
If you want the price instead, change the key to p, e.g.:
supermarkets = 'Vomar'
for market in range(12):
    if supermarkets in data[market].values():
        print(f'{supermarkets} is a market')
        # first 5 prices (the 'p' key)
        for item in range(5):
            print(data[market]['d'][item]['p'])
Output:
Vomar is a market
2.09
2.89
2.89
2.39
0.89
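To get what was originally asked for (the supermarket name together with every product link), you can loop over the lists directly instead of using range(). The following is only a sketch: iter_links is a made-up helper name, the sample string stands in for supermarkets.json, and the Coop link is written as a number on the assumption that this is how the file stores it.

```python
import json
from typing import Iterator


def iter_links(data: list) -> Iterator[tuple]:
    """Yield (supermarket name, relative product link) for every product."""
    for market in data:
        # .get() with a default tolerates an entry that has no "d" list.
        for product in market.get("d", []):
            link = product.get("l")
            if link is not None:    # skip products without a link
                yield market["n"], link


# Small stand-in for the real file, same shape as the sample in this thread.
sample = '''[
    {"n": "ah", "d": [
        {"n": "&c", "l": "wi410827/en-c", "p": 7.45, "s": "per stuk"},
        {"n": "100% Coconut grove", "l": "wi415202/100-coconut-grove", "p": 1.99, "s": "1 l"}
    ]},
    {"n": "coop", "d": [
        {"l": 8714319929120}
    ]}
]'''
data = json.loads(sample)

for name, link in iter_links(data):
    print(name, link)
```

With the real file you would replace the sample string with the json.load(fp) call from the code above. Because the helper is a generator, the pairs are produced one at a time rather than collected in a second large list.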