Posts: 22
Threads: 7
Joined: Apr 2020
Hey all,
I am trying to convert a json file into a dataframe.
import pandas as pd
import os
from pandas.io.json import json_normalize
data = []
path_to_json = '/Users/macbook/Desktop/Saveddata'
json_files = [pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]
for i in range(0,len(json_files)):
file = pd.read_json(r'/Users/macbook/Desktop/Saveddata/' + str(json_files[i]))
file2 = file.iloc[113:]
file2.iloc[0] output:
game-data [{'meta': {'user_id': '178054', 'level_id': 1,... In the example above, I would like 'user_id' to be a column header, 'level_id' to be a column header etc. '170854' would be the first row of the first column and '1' would be t he first row of the second column.
How can I achieve this?
I would be so grateful for a helping hand!
Posts: 7,315
Threads: 123
Joined: Sep 2016
Use json_normalize().
Can look at a example in this post
Posts: 12,028
Threads: 485
Joined: Sep 2016
the following will read all files into separate dataframes.
You will have to store each in a list (or whatever else you wish)
from pathlib import Path
import pandas as pd
# import json
sourcedir = Path('/Users/macbook/Desktop/Saveddata')
filelist = [x for x in sourcedir.iterdir() if x.is_file() and x.suffix == '.json.']
for file in filelist:
df = pd.pandas.read_json(file)
# now do what you wish with the dataframe as next file will overwrite df
# to read directly into a dictionary instead use:
# with file.open() as jfile:
# jdata = json.load(jfile)
Posts: 6,783
Threads: 20
Joined: Feb 2020
That depends on what is "meta"? Is meta a user that has an id == 178054?
import json
import pandas as pd
json_string = '[{"meta": {"user_id": "178054", "level_id": 1}}, {"beta": {"user_id": "178055", "level_id": 2}}]'
game_data = json.loads(json_string)
names = [key for player in game_data for key, value in player.items()]
stats = [value for player in game_data for key, value in player.items()]
game_df = pd.DataFrame(stats, index=names)
print(game_df) Output: user_id level_id
meta 178054 1
beta 178055 2
Or is meta something else? Should users be saved like this?
import json
import pandas
json_string = '[{"meta": {"user_id": ["178054", "178055"], "level_id": [1, 2]}}]'
game_data = json.loads(json_string)[0]["meta"]
game_df = pandas.DataFrame(game_data)
print(game_df) Output: user_id level_id
0 178054 1
1 178055 2
Seeing a tiny snapshot of your json is not enough. You need to tell us what the json file mans.
Posts: 22
Threads: 7
Joined: Apr 2020
Jan-28-2023, 10:23 PM
(This post was last modified: Jan-28-2023, 10:25 PM by eyavuz21.)
(Jan-28-2023, 09:19 PM)deanhystad Wrote: That depends on what is "meta"? Is meta a user that has an id == 178054?
import json
import pandas as pd
json_string = '[{"meta": {"user_id": "178054", "level_id": 1}}, {"beta": {"user_id": "178055", "level_id": 2}}]'
game_data = json.loads(json_string)
names = [key for player in game_data for key, value in player.items()]
stats = [value for player in game_data for key, value in player.items()]
game_df = pd.DataFrame(stats, index=names)
print(game_df) Output: user_id level_id
meta 178054 1
beta 178055 2
Or is meta something else? Should users be saved like this?
import json
import pandas
json_string = '[{"meta": {"user_id": ["178054", "178055"], "level_id": [1, 2]}}]'
game_data = json.loads(json_string)[0]["meta"]
game_df = pandas.DataFrame(game_data)
print(game_df) Output: user_id level_id
0 178054 1
1 178055 2
Seeing a tiny snapshot of your json is not enough. You need to tell us what the json file mans.
Thanks so much for your help - when I entered the line game_data = json.loads(json_string)[0]["meta"] I get the error message: TypeError: the JSON object must be str, bytes or bytearray, not Series. The variable json_loads looks like this:
game-data [{'meta': {'user_id': '178054', 'level_id': 1,...
Name: 113, dtype: object How can I resolve this?
Posts: 6,783
Threads: 20
Joined: Feb 2020
That is the same question as before with no additional information. Do you expect a different answer?
Posts: 22
Threads: 7
Joined: Apr 2020
(Jan-28-2023, 11:46 PM)deanhystad Wrote: That is the same question as before with no additional information. Do you expect a different answer?
Sorry if I was unclear with regards to your previous question. The answer should appear like this:
Output:
user_id level_id
0 178054 1
1 178055 2
When I try your code above, it gives me the error I mentioned above :)
Posts: 6,783
Threads: 20
Joined: Feb 2020
We know that part. What does the json look like? You only provide a tiny peek, not enough to understand what information is in the json file.
Posts: 22
Threads: 7
Joined: Apr 2020
Jan-29-2023, 12:08 AM
(This post was last modified: Jan-29-2023, 12:08 AM by eyavuz21.)
(Jan-29-2023, 12:02 AM)deanhystad Wrote: We know that part. What does the json look like? You only provide a tiny peek, not enough to understand what information is in the json file.
Is this enough information? - this is what the output of 'file' looks like: (file = pd.read_json()). Let me know what else to print if not!
game-data
113 [{'meta': {'user_id': '178054', 'level_id': 1,...
114 [{'meta': {'user_id': '178054', 'level_id': 1,...
115 [{'meta': {'user_id': '178054', 'level_id': 2,...
116 [{'meta': {'user_id': '178054', 'level_id': 11...
117 [{'meta': {'user_id': '178054', 'level_id': 11...
... ...
799 [{'meta': {'user_id': '178245', 'level_id': 1,...
800 [{'meta': {'user_id': '178245', 'level_id': 2,...
801 [{'meta': {'user_id': '178245', 'level_id': 11...
802 [{'meta': {'user_id': '178245', 'level_id': 32...
803 [{'meta': {'user_id': '178245', 'level_id': 42...
Posts: 6,783
Threads: 20
Joined: Feb 2020
Jan-29-2023, 02:26 AM
(This post was last modified: Jan-29-2023, 03:02 AM by deanhystad.)
That is not a json file. game-data is not valid json, and json syntax uses double quotes instead of single quotes. The file also looks like it is several individual json strings, not one data structure. Where did you get the file?
I'm going to guess that the file looks like this:
Output: game-data
[{'meta': {'user_id': '178054', 'level_id': 1}}]
[{'meta': {'user_id': '178054', 'level_id': 1}}]
[{'meta': {'user_id': '178054', 'level_id': 2}}]
Using that format, I wrote a program that skips the first line. For each following line it replaces single quotes with double quotes and does a json.loads() to get a python list of dictionaries. Then, thinking there is only one dictionary in the list, it gets the first dictionary and gets the values associated with the "meta" key. That information is collected into a list. Finally the json information is normalized and loaded into a dataframe.
import json
import pandas as pd
game_data = []
with open('data.txt', 'r') as file:
next(file)
for line in file:
line = line.replace("'", '"')
game_data.append(json.loads(line)[0]['meta'])
df = pd.json_normalize(game_data)
print(df) Output: user_id level_id
0 178054 1
1 178054 1
2 178054 2
|