Python Forum

I would like to navigate through a complex JSON Dictionary and format it to be usable in an Excel file. I am fairly new to Python I've only been working with it for a couple of days.

import requests
import json
from pprint import pprint
import openpyxl
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
dest_filename = 'MyTestSheet.xlsx'
ws.title = "MyTestSheet"

data_url = ''
response = requests.get(data_url)
response.raise_for_status()
data = response.json()

x = 1
y = 1

for i in range(0,100):
	ws.cell(row=x, column=y, value=str(data['features'][i]['attributes']))
	x = x+1

wb.save('MyTestSheet.xlsx')

What I want to do is print each item in the 'attributes' dictionary to its own row and then move to the next column and do the same for the next 'attributes' in 'features'. I know I can access the individual 'attributes' using the index like below.

data['features'][0]['attributes']

What I don't know how to do is how to loop through each string in the 'attributes' dictionary and print it to its own row in excel.

data['features'][0]['attributes'][0]

Above doesn't work and I think I know why its because:

type(data['features'])
Out[71]: list

type(data['features'][0]['attributes'])
Out[72]: dict

Additionally, I don't want to have to define a range for the index i = 0 to i = 100, coming from Excel VBA I am looking for something like usedrange.rows.count which essentially returns a count of the number of rows with data in them. That way my ranges could be dynamic if I use the same code with a different URL.

Think

The current code works but the output is crappy and I could do some excel manipulation to clean it up but why not shoot for a solution where python does it for me?

I really Heart

the way pprint() makes it look.

OUCH I CANT POST CLICKABLE LINKS Wall

data_url = 'x'

x has been deleted so I can post the link

Does it HAVE to be an excel document? Excel will open csv just fine, and generating csv is much easier to think about.
Also, dicts don't have ordered keys, which means each feature's attributes might be in a different order. So just writing them out in the order you find them might give wacky output.

Here's what I came up with. It keeps track of which attribute-keys it's seen, then uses that to make sure all values go into the right columns. Finally, it's all written to a csv file. Also, I faked the json data so it can be run and tested:

import csv

# fake json data
data = {
    "features": [
        {
            "attributes": {
                "color": "green",
                "size": "5",
                "can_rotate": "no"
            }
        },
        {
            "attributes": {
                "can_rotate": "yes",
                "color": "black"
            }
        },
        {
            "attributes": {
                "color": "red",
                "size": "9"
            }
        }
    ]
}

columns = []
all_rows = []
for feature in data["features"]:
    row = ["" for col in columns]
    for key, value in feature["attributes"].items():
        try:
            index = columns.index(key)
        except ValueError:
            # this column hasn't been seen before
            columns.append(key)
            row.append("")
            index = len(columns) - 1

        row[index] = value
    all_rows.append(row)

with open("my-file.csv", "w") as csvfile:
    writer = csv.writer(csvfile)
    # first row is the headers
    writer.writerow(columns)
    # then, the rows
    writer.writerows(all_rows)

This is great it works perfectly with my actual data. I will spend a little bit of time figuring out what you did and use it as a learning experience. Thank you!

I don't know what the json looks like, but if all attributes have all the properties, it can be simplified quite a bit. Most of what I wrote is just getting around possible missing attributes.

For example, if all the attributes always exist for each feature, it gets nearly simple:

with open("my-file.csv", "w") as csvfile:
    writer = csv.writer(csvfile)

    columns = [ ]
    for feature in data["features"]:
        if not columns:
            # add the headers
            columns = feature["attributes"].keys()
            writer.writerow(columns)

        row = [feature["attributes"][column] for column in columns]
        writer.writerow(row)

FYI:
dictionaries are ordered as of version 3.6

Ordered... predictably? Or do they just maintain the order of the originating data?
If the source json had the keys in different orders, would dicts sort them to be the same, or just maintain the order they originally had?

original order.

test:

def ziggy():
    unordered_dict = {
        'b': 'This is b',
        'h': 'This is h',
        'a': 'This is a',
        'c': 'This is c'
    }

    for key, value in unordered_dict.items():
        print('key: {} value: {}'.format(key, value))

results:

Output:key: b value: This is b
key: h value: This is h
key: a value: This is a
key: c value: This is c

Process finished with exit code 0

SimpleGuy

nilamo

SimpleGuy

nilamo

Larz60+

nilamo

Larz60+