Python Forum
Read Nested JSON with pandas.io.json
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Read Nested JSON with pandas.io.json
#1
Hi, I need help with read a JSON for next working with data.

How Can I get table with 4 columns: Data.Code; Data.snapshots.DateFrom; Data.snapshots.Address.Street; Data.snapshots.Address.City

This is my code, but it is necessary to correct it, but I do not how. The Code works but it returns 30 columns and not exactly what I want.

Thank you for your time Shy

import pandas as pd
import requests
import pandas.io.json as pd_json

log = ("user", "password")
url = "http://serverxyz/api/v1/Catalog/Categories?pageSize=2&pageIndex=0"

req = requests.get(url, auth = log)
req.raise_for_status()
fin = req.json()

df = pd_json.json_normalize(fin, 
                        record_path=['Data','snapshots'],
                        record_prefix = 'Data.',
                        errors = 'ignore'
                        )

print(df)
This is my JSON:

My JSON sample to download from OneDrive


Quote:{
"PageIndex": 0,
"PageSize": 2,
"TotalCount": 100248,
"Data": [
{
"Code": "859182400102974",
"Timestamp": "2019-04-17T12:16:51Z",
"Category": 0,
"snapshots": [
{
"Code": "859182400102974",
"DateFrom": "2016-12-31T23:00:00Z",
"DateTo": "2017-05-09T22:00:00Z",
"Type": "CCO",
"VoltageLevel": 400,
"IsIsland": false,
"IsPps": false,
"MeasurementType": "CMC",
"InstalledPower": 0,
"GridId": 11,
"MeteredDataProvider": "8591824048108",
"Supplier": "8591824071403",
"SubjectOfSettlement": "8591824071403",
"IsSummarizingForSubjectOfSettlement": false,
"AnnualConsumptionEstimation": -502,
"TDDClass": "004",
"TempArea": "009",
"IsForeign": false,
"IsSLRActive": false,
"DGIFrequency": 1,
"FirstMonthReading": 5,
"IsCompositeService": true,
"IsAggregatedInvoice": true,
"IsImplicitSoS": false,
"ReservedPower": 0,
"PhasesCount": "3",
"IsMicrosource": false,
"IsDisconnectionPlanned": false,
"Name": "Petra"
},
{
"Code": "859182400102974",
"DateFrom": "2017-05-09T22:00:00Z",
"DateTo": "2018-01-31T23:00:00Z",
"Type": "CCO",
"VoltageLevel": 400,
"IsIsland": false,
"IsPps": false,
"MeasurementType": "CMC",
"InstalledPower": 0,
"GridId": 11,
"MeteredDataProvider": "8591824048108",
"Supplier": "8591824071403",
"SubjectOfSettlement": "8591824071403",
"IsSummarizingForSubjectOfSettlement": false,
"AnnualConsumptionEstimation": -382,
"TDDClass": "004",
"TempArea": "009",
"IsForeign": false,
"IsSLRActive": false,
"DGIFrequency": 1,
"FirstMonthReading": 5,
"IsCompositeService": true,
"IsAggregatedInvoice": true,
"IsImplicitSoS": false,
"ReservedPower": 0,
"PhasesCount": "3",
"IsMicrosource": false,
"IsDisconnectionPlanned": false,
"Name": "Petra"
}
],
"scalars": {
"ConsumptionEstimation": [
{
"DateFrom": "2016-12-31T23:00:00Z",
"DateTo": "2017-05-09T22:00:00Z",
"ConsumptionEstimation": -502
},
{
"DateFrom": "2017-05-09T22:00:00Z",
"DateTo": "2018-01-31T23:00:00Z",
"ConsumptionEstimation": -382
}
],
"ConsumptionEstimation2": [
{
"DateFrom": "2016-12-31T23:00:00Z",
"DateTo": "2017-05-09T22:00:00Z",
"ConsumptionEstimation2": -502
},
{
"DateFrom": "2017-05-09T22:00:00Z",
"DateTo": "2018-01-31T23:00:00Z",
"ConsumptionEstimation2": -382
}
]
}
},
{
"Code": "859182400104897",
"Timestamp": "2019-04-17T12:16:51Z",
"Category": 0,
"snapshots": [
{
"Code": "859182400104897",
"DateFrom": "2016-11-18T23:00:00Z",
"DateTo": "2017-11-05T23:00:00Z",
"Type": "CCO",
"VoltageLevel": 400,
"IsIsland": false,
"IsPps": false,
"MeasurementType": "CMC",
"InstalledPower": 0,
"GridId": 11,
"MeteredDataProvider": "8591824048108",
"Supplier": "8591824071403",
"SubjectOfSettlement": "8591824071403",
"IsSummarizingForSubjectOfSettlement": false,
"AnnualConsumptionEstimation": -280,
"TDDClass": "004",
"TempArea": "009",
"IsForeign": false,
"Address": {
"Street": "Okružní",
"City": "Semovo Ústí",
"PostCode": "39102"
},
"IsSLRActive": false,
"DGIFrequency": 0,
"FirstMonthReading": 0,
"IsCompositeService": false,
"IsAggregatedInvoice": false,
"IsImplicitSoS": false,
"ReservedPower": 0,
"IsMicrosource": false,
"IsDisconnectionPlanned": false,
"Name": "Martin"
},
{
"Code": "859182400104897",
"DateFrom": "2017-11-05T23:00:00Z",
"DateTo": "2027-01-16T23:00:00Z",
"Type": "CCO",
"VoltageLevel": 400,
"IsIsland": false,
"IsPps": false,
"MeasurementType": "CMC",
"InstalledPower": 0,
"GridId": 11,
"MeteredDataProvider": "8591824048108",
"Supplier": "8591824071403",
"SubjectOfSettlement": "8591824071403",
"IsSummarizingForSubjectOfSettlement": false,
"AnnualConsumptionEstimation": -282,
"TDDClass": "004",
"TempArea": "009",
"IsForeign": false,
"Address": {
"Street": "Okružní",
"City": "Semovo Ústí",
"PostCode": "39102"
},
"IsSLRActive": false,
"DGIFrequency": 0,
"FirstMonthReading": 0,
"IsCompositeService": false,
"IsAggregatedInvoice": false,
"IsImplicitSoS": false,
"ReservedPower": 0,
"IsMicrosource": false,
"IsDisconnectionPlanned": false,
"Name": "Martin"
}
],
"scalars": {
"ConsumptionEstimation": [
{
"DateFrom": "2016-11-18T23:00:00Z",
"DateTo": "2017-11-05T23:00:00Z",
"ConsumptionEstimation": -280
},
{
"DateFrom": "2017-11-05T23:00:00Z",
"DateTo": "2027-01-16T23:00:00Z",
"ConsumptionEstimation": -282
}
],
"ConsumptionEstimation2": [
{
"DateFrom": "2016-11-18T23:00:00Z",
"DateTo": "2017-11-05T23:00:00Z",
"ConsumptionEstimation2": -280
},
{
"DateFrom": "2017-11-05T23:00:00Z",
"DateTo": "2027-01-16T23:00:00Z",
"ConsumptionEstimation2": -282
}
]
}
}
]
}
Reply
#2
You can access only some columns of df using:
new_df = df[['Data.Code', 'Data.DateFrom', 'Data.Address']]
Reply
#3
(Apr-23-2019, 03:24 PM)gontajones Wrote: You can access only some columns of df using:
new_df = df[['Data.Code', 'Data.DateFrom', 'Data.Address']]

- But what about column Data.Address? There is lot of information in one column but not separetly. Instead of {'Street': 'Okružní', 'City': 'Sezimovo Ústí', 'PostCode': '39102'} I want only 2 separate columns Street and City

- And prefix of column is not only Data.xyz but for examlpe Data.snapshots.DateFrom or Data.snapshots.Address.Street etc.

- And it is not better use "df = pd_json.json_normalize" for reading and assigning to "df" only columns which I want, not all columns?

I think this part of code is necessary to modify, but I do not how

df = pd_json.json_normalize(fin, 
                        record_path=['Data','snapshots'],
                        record_prefix = 'Data.',
                        errors = 'ignore'
                        )
Reply
#4
This is not the best way but it is a start...

import pandas as pd
import pandas.io.json as pd_json
import json

with open("data.json", "r") as fd:
    json_content = json.load(fd)
    my_df_list = []
    for _, entry in enumerate(json_content['Data']):
        my_df_list.append(pd_json.json_normalize(entry['snapshots'], record_prefix='Data.'))
    df = pd.concat(my_df_list, ignore_index=True, sort=False)
    print(df)
Reply
#5
I think, your code is good way how to start solve my problem.
Although I have a problem with transform it just like my ideas. I can not find simple example, how to go deeper or shallower in nested JSON (JSON with lot of levels). I want to know how to get one information from each level of JSON and put it into table.
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Pandas read csv file in 'date/time' chunks MorganSamage 4 1,628 Feb-13-2023, 11:24 AM
Last Post: MorganSamage
  How to customize VS Code setting.json? NewPi 1 884 Dec-11-2022, 08:03 PM
Last Post: jefsummers
Smile How to further boost the data read write speed using pandas tjk9501 1 1,215 Nov-14-2022, 01:46 PM
Last Post: jefsummers
  Parse Nested JSON String in Python rwalde 4 2,817 Sep-08-2022, 10:32 AM
Last Post: rwalde
Information Parssing Json.dump from PYTHON to PHP for output on browser jodqueshiva 1 2,345 Nov-01-2021, 02:34 PM
Last Post: snippsat
  Write a dictionary with arrays as values into JSON format paul18fr 3 5,455 Oct-20-2021, 10:38 AM
Last Post: buran
  HELP! Importing json file into csv into jupyter notebook vilsef 2 2,512 Jan-22-2021, 11:06 AM
Last Post: snippsat
  Conversion CSV to JSON mati 1 1,767 Jan-09-2021, 08:19 PM
Last Post: jefsummers
  How to compare two json and write to third json differences with pandas and numpy onenessboy 0 4,608 Jul-24-2020, 01:56 PM
Last Post: onenessboy
  JSON file Loading issue punna111 4 8,455 Jun-29-2020, 08:07 AM
Last Post: buran

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020