Python Forum
Converting a json file to a dataframe with rows and columns - Printable Version

+- Python Forum (https://python-forum.io)
+-- Forum: Python Coding (https://python-forum.io/forum-7.html)
+--- Forum: General Coding Help (https://python-forum.io/forum-8.html)
+--- Thread: Converting a json file to a dataframe with rows and columns (/thread-39304.html)

Pages: 1 2


Converting a json file to a dataframe with rows and columns - eyavuz21 - Jan-28-2023

Hey all,

I am trying to convert a json file into a dataframe.

import pandas as pd 
import os 
from pandas.io.json import json_normalize

data = []

path_to_json = '/Users/macbook/Desktop/Saveddata'
json_files = [pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]
for i in range(0,len(json_files)):
    file = pd.read_json(r'/Users/macbook/Desktop/Saveddata/' + str(json_files[i]))

file2 = file.iloc[113:]
file2.iloc[0]
output:
game-data    [{'meta': {'user_id': '178054', 'level_id': 1,...
In the example above, I would like 'user_id' to be a column header, 'level_id' to be a column header etc. '170854' would be the first row of the first column and '1' would be the first row of the second column.

How can I achieve this?

I would be so grateful for a helping hand!


RE: Converting a json file to a dataframe with rows and columns - snippsat - Jan-28-2023

Use json_normalize().
Can look at a example in this post


RE: Converting a json file to a dataframe with rows and columns - Larz60+ - Jan-28-2023

the following will read all files into separate dataframes.
You will have to store each in a list (or whatever else you wish)

from pathlib import Path
import pandas as pd
# import json 


sourcedir = Path('/Users/macbook/Desktop/Saveddata')
filelist = [x for x in sourcedir.iterdir() if x.is_file() and x.suffix == '.json.']

for file in filelist:
    df = pd.pandas.read_json(file)
    # now do what you wish with the dataframe as next file will overwrite df

    # to read directly into a dictionary instead use:
    # with file.open() as jfile:
    #     jdata = json.load(jfile)



RE: Converting a json file to a dataframe with rows and columns - deanhystad - Jan-28-2023

That depends on what is "meta"? Is meta a user that has an id == 178054?
import json
import pandas as pd

json_string = '[{"meta": {"user_id": "178054", "level_id": 1}}, {"beta": {"user_id": "178055", "level_id": 2}}]'
game_data = json.loads(json_string)
names = [key for player in game_data for key, value in player.items()]
stats = [value for player in game_data for key, value in player.items()]
game_df = pd.DataFrame(stats, index=names)
print(game_df)
Output:
user_id level_id meta 178054 1 beta 178055 2
Or is meta something else? Should users be saved like this?
import json
import pandas

json_string = '[{"meta": {"user_id": ["178054", "178055"], "level_id": [1, 2]}}]'
game_data = json.loads(json_string)[0]["meta"]
game_df = pandas.DataFrame(game_data)
print(game_df)
Output:
user_id level_id 0 178054 1 1 178055 2
Seeing a tiny snapshot of your json is not enough. You need to tell us what the json file mans.


RE: Converting a json file to a dataframe with rows and columns - eyavuz21 - Jan-28-2023

(Jan-28-2023, 09:19 PM)deanhystad Wrote: That depends on what is "meta"? Is meta a user that has an id == 178054?
import json
import pandas as pd

json_string = '[{"meta": {"user_id": "178054", "level_id": 1}}, {"beta": {"user_id": "178055", "level_id": 2}}]'
game_data = json.loads(json_string)
names = [key for player in game_data for key, value in player.items()]
stats = [value for player in game_data for key, value in player.items()]
game_df = pd.DataFrame(stats, index=names)
print(game_df)
Output:
user_id level_id meta 178054 1 beta 178055 2
Or is meta something else? Should users be saved like this?
import json
import pandas

json_string = '[{"meta": {"user_id": ["178054", "178055"], "level_id": [1, 2]}}]'
game_data = json.loads(json_string)[0]["meta"]
game_df = pandas.DataFrame(game_data)
print(game_df)
Output:
user_id level_id 0 178054 1 1 178055 2
Seeing a tiny snapshot of your json is not enough. You need to tell us what the json file mans.

Thanks so much for your help - when I entered the line
game_data = json.loads(json_string)[0]["meta"]
I get the error message: TypeError: the JSON object must be str, bytes or bytearray, not Series. The variable json_loads looks like this:

game-data    [{'meta': {'user_id': '178054', 'level_id': 1,...
Name: 113, dtype: object
How can I resolve this?


RE: Converting a json file to a dataframe with rows and columns - deanhystad - Jan-28-2023

That is the same question as before with no additional information. Do you expect a different answer?


RE: Converting a json file to a dataframe with rows and columns - eyavuz21 - Jan-28-2023

(Jan-28-2023, 11:46 PM)deanhystad Wrote: That is the same question as before with no additional information. Do you expect a different answer?

Sorry if I was unclear with regards to your previous question. The answer should appear like this:

Output:
user_id level_id
0 178054 1
1 178055 2

When I try your code above, it gives me the error I mentioned above :)


RE: Converting a json file to a dataframe with rows and columns - deanhystad - Jan-29-2023

We know that part. What does the json look like? You only provide a tiny peek, not enough to understand what information is in the json file.


RE: Converting a json file to a dataframe with rows and columns - eyavuz21 - Jan-29-2023

(Jan-29-2023, 12:02 AM)deanhystad Wrote: We know that part. What does the json look like? You only provide a tiny peek, not enough to understand what information is in the json file.

Is this enough information? - this is what the output of 'file' looks like: (file = pd.read_json()). Let me know what else to print if not!

game-data
113	[{'meta': {'user_id': '178054', 'level_id': 1,...
114	[{'meta': {'user_id': '178054', 'level_id': 1,...
115	[{'meta': {'user_id': '178054', 'level_id': 2,...
116	[{'meta': {'user_id': '178054', 'level_id': 11...
117	[{'meta': {'user_id': '178054', 'level_id': 11...
...	...
799	[{'meta': {'user_id': '178245', 'level_id': 1,...
800	[{'meta': {'user_id': '178245', 'level_id': 2,...
801	[{'meta': {'user_id': '178245', 'level_id': 11...
802	[{'meta': {'user_id': '178245', 'level_id': 32...
803	[{'meta': {'user_id': '178245', 'level_id': 42...



RE: Converting a json file to a dataframe with rows and columns - deanhystad - Jan-29-2023

That is not a json file. game-data is not valid json, and json syntax uses double quotes instead of single quotes. The file also looks like it is several individual json strings, not one data structure. Where did you get the file?

I'm going to guess that the file looks like this:
Output:
game-data [{'meta': {'user_id': '178054', 'level_id': 1}}] [{'meta': {'user_id': '178054', 'level_id': 1}}] [{'meta': {'user_id': '178054', 'level_id': 2}}]
Using that format, I wrote a program that skips the first line. For each following line it replaces single quotes with double quotes and does a json.loads() to get a python list of dictionaries. Then, thinking there is only one dictionary in the list, it gets the first dictionary and gets the values associated with the "meta" key. That information is collected into a list. Finally the json information is normalized and loaded into a dataframe.
import json
import pandas as pd

game_data = []
with open('data.txt', 'r') as file:
    next(file)
    for line in file:
        line = line.replace("'", '"')
        game_data.append(json.loads(line)[0]['meta'])

df = pd.json_normalize(game_data)
print(df)
Output:
user_id level_id 0 178054 1 1 178054 1 2 178054 2