Python Forum
Converting a json file to a dataframe with rows and columns
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Converting a json file to a dataframe with rows and columns
#11
Thanks so much for your help. I have tried your code above - it gives me an empty dataframe.

When I try the following code:

import json
game_data = []

with open('/Users/macbook/Desktop/SavedSHQdatafullgame/All-gamedata-279-20230126-120924.json', 'r') as file:
    readfile = file.read()

json.loads(readfile)
json.loads(readfile) looks like this:

[{'meta': {'user_id': '178054',
     'level_id': 1,
     'duration': 16.6286621,
     'previous_attempts': 1,
     'early_termination': False,
     'platform': 'iOS',
     'app_version': '1.9.23',
     'map_view_duration': 7.3983},
readfile looks like this:

[{"meta":{"user_id":"178054","level_id":1,"duration":14.6162566999999992134462445392273366451263427734375,"previous_attempts":1,"early_termination":false,"platform":"ANDROID","app_version":"1.9.24","map_view_duration":15.2666930999999994611471265670843422412872314453125},"events":[{"time":0,
Reply
#12
(Jan-29-2023, 02:26 AM)deanhystad Wrote: That is not a json file. game-data is not valid json, and json syntax uses double quotes instead of single quotes. The file also looks like it is several individual json strings, not one data structure. Where did you get the file?

I'm going to guess that the file looks like this:
Output:
game-data [{'meta': {'user_id': '178054', 'level_id': 1}}] [{'meta': {'user_id': '178054', 'level_id': 1}}] [{'meta': {'user_id': '178054', 'level_id': 2}}]
Using that format, I wrote a program that skips the first line. For each following line it replaces single quotes with double quotes and does a json.loads() to get a python list of dictionaries. Then, thinking there is only one dictionary in the list, it gets the first dictionary and gets the values associated with the "meta" key. That information is collected into a list. Finally the json information is normalized and loaded into a dataframe.
import json
import pandas as pd

game_data = []
with open('data.txt', 'r') as file:
    next(file)
    for line in file:
        line = line.replace("'", '"')
        game_data.append(json.loads(line)[0]['meta'])

df = pd.json_normalize(game_data)
print(df)
Output:
user_id level_id 0 178054 1 1 178054 1 2 178054 2

Thanks so much for your help. I have tried your code above - it gives me an empty dataframe.

When I try the following code:

import json
game_data = []

with open('/Users/macbook/Desktop/SavedSHQdatafullgame/All-gamedata-279-20230126-120924.json', 'r') as file:
    readfile = file.read()

json.loads(readfile)
json.loads(readfile) looks like this:

[{'meta': {'user_id': '178054',
     'level_id': 1,
     'duration': 16.6286621,
     'previous_attempts': 1,
     'early_termination': False,
     'platform': 'iOS',
     'app_version': '1.9.23',
     'map_view_duration': 7.3983},
readfile looks like this:

[{"meta":{"user_id":"178054","level_id":1,"duration":14.6162566999999992134462445392273366451263427734375,"previous_attempts":1,"early_termination":false,"platform":"ANDROID","app_version":"1.9.24","map_view_duration":15.2666930999999994611471265670843422412872314453125},"events":[{"time":0,
Reply
#13
(Jan-29-2023, 11:02 AM)eyavuz21 Wrote: Thanks so much for your help. I have tried your code above - it gives me an empty dataframe.

When I try the following code:
You mess the json file to a string when doing that,most use json.load(file) and not file.read()
You post complete raw json data,or working smample of it it's big.

Here a comlete example that help if you look at undertand how it works.
Use eg jsoncrack to see the structure better and that is vaild json file.
game.json
Output:
{ "results": [ { "leagues": [ { "matches": [ { "league": "Premier League", "teams": "Liverpool - Chelsea", "score": [ "3:0" ] }, { "league": "Premier League", "teams": "Man Utd - Arsenal", "score": [ "2:1" ] } ] }, { "matches": [ { "league": "La Liga", "teams": "Atletico Madrid - Villareal", "score": [ "0:2" ] } ] } ] } ] }
I want this result of game.json as a DataFrame.
Output:
league teams score 0 Premier League Liverpool - Chelsea [3:0] 1 Premier League Man Utd - Arsenal [2:1]
Then i would use json_normalize like this in code under.
import pandas as pd
import json

with open('game.json') as file:
    json_data = json.load(file)

df = pd.json_normalize(json_data['results'][0]['leagues'][0], record_path='matches')
Look at result.
>>> df
           league                teams  score
0  Premier League  Liverpool - Chelsea  [3:0]
1  Premier League    Man Utd - Arsenal  [2:1]
Reply
#14
(Jan-29-2023, 01:56 PM)snippsat Wrote:
(Jan-29-2023, 11:02 AM)eyavuz21 Wrote: Thanks so much for your help. I have tried your code above - it gives me an empty dataframe.

When I try the following code:
You mess the json file to a string when doing that,most use json.load(file) and not file.read()
You post complete raw json data,or working smample of it it's big.

Here a comlete example that help if you look at undertand how it works.
Use eg jsoncrack to see the structure better and that is vaild json file.
game.json
Output:
{ "results": [ { "leagues": [ { "matches": [ { "league": "Premier League", "teams": "Liverpool - Chelsea", "score": [ "3:0" ] }, { "league": "Premier League", "teams": "Man Utd - Arsenal", "score": [ "2:1" ] } ] }, { "matches": [ { "league": "La Liga", "teams": "Atletico Madrid - Villareal", "score": [ "0:2" ] } ] } ] } ] }
I want this result of game.json as a DataFrame.
Output:
league teams score 0 Premier League Liverpool - Chelsea [3:0] 1 Premier League Man Utd - Arsenal [2:1]
Then i would use json_normalize like this in code under.
import pandas as pd
import json

with open('game.json') as file:
    json_data = json.load(file)

df = pd.json_normalize(json_data['results'][0]['leagues'][0], record_path='matches')
Look at result.
>>> df
           league                teams  score
0  Premier League  Liverpool - Chelsea  [3:0]
1  Premier League    Man Utd - Arsenal  [2:1]

Thank you so much for your help. I have now managed to find a solution:

import json
import pandas as pd

newdataframe = pd.DataFrame(index=range(705),columns=range(9))

with open('/Users/macbook/Desktop/SavedSHQdatafullgame/All-gamedata-279-20230129-023739.json', 'r') as file:
    json_data = json.load(file)
 
for i in range(115,len(json_data['game-data'])): 
    relevantrow = pd.json_normalize(json_data['game-data'][i])
    j = i - 115 
    newdataframe.iloc[j,0] = relevantrow['player'][0]
    newdataframe.iloc[j,1] = relevantrow['meta.user_id'][0]
    newdataframe.iloc[j,2] = relevantrow['meta.level_id'][0]
    newdataframe.iloc[j,3] = relevantrow['meta.duration'][0]
    newdataframe.iloc[j,4] = relevantrow['meta.previous_attempts'][0]
    newdataframe.iloc[j,5] = relevantrow['meta.early_termination'][0]
    newdataframe.iloc[j,6] = relevantrow['meta.platform'][0]
    newdataframe.iloc[j,7] = relevantrow['meta.app_version'][0]
    newdataframe.iloc[j,8] = relevantrow['meta.map_view_duration'][0]
 
newdataframe = newdataframe.rename(columns={0: "Coordinates",1:"ID",2:"Level",3:"Duration",4:"Previousattempts?",5:"Earlytermination?",6:"Platform",7:"Appversion",8:"Durationofmapview"})
newdataframe.to_csv('/Users/macbook/Desktop/SavedSHQdatafullgame/fulldata.csv')
This results in 'newdataframe' looking like this:

Coordinates	ID	Level	Duration	Previousattempts?	Earlytermination?	Platform	Appversion	Durationofmapview
0	[{'x': 11, 'y': 12, 'r': 2}, {'x': 11, 'y': 12...	178054	1	16.628662	1	False	iOS	1.9.23	7.3983
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Converting column of values into muliple columns of counts highland44 0 205 Feb-01-2024, 12:48 AM
Last Post: highland44
  Converting .txt to .csv file SunWers 21 11,713 Jan-20-2024, 10:03 AM
Last Post: Larz60+
  Create Choices from .ods file columns cspower 3 518 Dec-28-2023, 09:59 PM
Last Post: deanhystad
  parse json field from csv file lebossejames 4 668 Nov-14-2023, 11:34 PM
Last Post: snippsat
  Create csv file with 4 columns for process mining thomaskissas33 3 691 Nov-06-2023, 09:36 PM
Last Post: deanhystad
  Python Script to convert Json to CSV file chvsnarayana 8 2,343 Apr-26-2023, 10:31 PM
Last Post: DeaD_EyE
  Loop through json file and reset values [SOLVED] AlphaInc 2 1,960 Apr-06-2023, 11:15 AM
Last Post: AlphaInc
  Reading Specific Rows In a CSV File finndude 3 940 Dec-13-2022, 03:19 PM
Last Post: finndude
  How to properly format rows and columns in excel data from parsed .txt blocks jh67 7 1,799 Dec-12-2022, 08:22 PM
Last Post: jh67
  validate large json file with millions of records in batches herobpv 3 1,221 Dec-10-2022, 10:36 PM
Last Post: bowlofred

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020