Python Forum
Converting a json file to a dataframe with rows and columns
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Converting a json file to a dataframe with rows and columns
#1
Hey all,

I am trying to convert a json file into a dataframe.

import pandas as pd 
import os 
from pandas.io.json import json_normalize

data = []

path_to_json = '/Users/macbook/Desktop/Saveddata'
json_files = [pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]
for i in range(0,len(json_files)):
    file = pd.read_json(r'/Users/macbook/Desktop/Saveddata/' + str(json_files[i]))

file2 = file.iloc[113:]
file2.iloc[0]
output:
game-data    [{'meta': {'user_id': '178054', 'level_id': 1,...
In the example above, I would like 'user_id' to be a column header, 'level_id' to be a column header etc. '170854' would be the first row of the first column and '1' would be the first row of the second column.

How can I achieve this?

I would be so grateful for a helping hand!
Reply
#2
Use json_normalize().
Can look at a example in this post
Reply
#3
the following will read all files into separate dataframes.
You will have to store each in a list (or whatever else you wish)

from pathlib import Path
import pandas as pd
# import json 


sourcedir = Path('/Users/macbook/Desktop/Saveddata')
filelist = [x for x in sourcedir.iterdir() if x.is_file() and x.suffix == '.json.']

for file in filelist:
    df = pd.pandas.read_json(file)
    # now do what you wish with the dataframe as next file will overwrite df

    # to read directly into a dictionary instead use:
    # with file.open() as jfile:
    #     jdata = json.load(jfile)
Reply
#4
That depends on what is "meta"? Is meta a user that has an id == 178054?
import json
import pandas as pd

json_string = '[{"meta": {"user_id": "178054", "level_id": 1}}, {"beta": {"user_id": "178055", "level_id": 2}}]'
game_data = json.loads(json_string)
names = [key for player in game_data for key, value in player.items()]
stats = [value for player in game_data for key, value in player.items()]
game_df = pd.DataFrame(stats, index=names)
print(game_df)
Output:
user_id level_id meta 178054 1 beta 178055 2
Or is meta something else? Should users be saved like this?
import json
import pandas

json_string = '[{"meta": {"user_id": ["178054", "178055"], "level_id": [1, 2]}}]'
game_data = json.loads(json_string)[0]["meta"]
game_df = pandas.DataFrame(game_data)
print(game_df)
Output:
user_id level_id 0 178054 1 1 178055 2
Seeing a tiny snapshot of your json is not enough. You need to tell us what the json file mans.
Reply
#5
(Jan-28-2023, 09:19 PM)deanhystad Wrote: That depends on what is "meta"? Is meta a user that has an id == 178054?
import json
import pandas as pd

json_string = '[{"meta": {"user_id": "178054", "level_id": 1}}, {"beta": {"user_id": "178055", "level_id": 2}}]'
game_data = json.loads(json_string)
names = [key for player in game_data for key, value in player.items()]
stats = [value for player in game_data for key, value in player.items()]
game_df = pd.DataFrame(stats, index=names)
print(game_df)
Output:
user_id level_id meta 178054 1 beta 178055 2
Or is meta something else? Should users be saved like this?
import json
import pandas

json_string = '[{"meta": {"user_id": ["178054", "178055"], "level_id": [1, 2]}}]'
game_data = json.loads(json_string)[0]["meta"]
game_df = pandas.DataFrame(game_data)
print(game_df)
Output:
user_id level_id 0 178054 1 1 178055 2
Seeing a tiny snapshot of your json is not enough. You need to tell us what the json file mans.

Thanks so much for your help - when I entered the line
game_data = json.loads(json_string)[0]["meta"]
I get the error message: TypeError: the JSON object must be str, bytes or bytearray, not Series. The variable json_loads looks like this:

game-data    [{'meta': {'user_id': '178054', 'level_id': 1,...
Name: 113, dtype: object
How can I resolve this?
Reply
#6
That is the same question as before with no additional information. Do you expect a different answer?
Reply
#7
(Jan-28-2023, 11:46 PM)deanhystad Wrote: That is the same question as before with no additional information. Do you expect a different answer?

Sorry if I was unclear with regards to your previous question. The answer should appear like this:

Output:
user_id level_id
0 178054 1
1 178055 2

When I try your code above, it gives me the error I mentioned above :)
Reply
#8
We know that part. What does the json look like? You only provide a tiny peek, not enough to understand what information is in the json file.
Reply
#9
(Jan-29-2023, 12:02 AM)deanhystad Wrote: We know that part. What does the json look like? You only provide a tiny peek, not enough to understand what information is in the json file.

Is this enough information? - this is what the output of 'file' looks like: (file = pd.read_json()). Let me know what else to print if not!

game-data
113	[{'meta': {'user_id': '178054', 'level_id': 1,...
114	[{'meta': {'user_id': '178054', 'level_id': 1,...
115	[{'meta': {'user_id': '178054', 'level_id': 2,...
116	[{'meta': {'user_id': '178054', 'level_id': 11...
117	[{'meta': {'user_id': '178054', 'level_id': 11...
...	...
799	[{'meta': {'user_id': '178245', 'level_id': 1,...
800	[{'meta': {'user_id': '178245', 'level_id': 2,...
801	[{'meta': {'user_id': '178245', 'level_id': 11...
802	[{'meta': {'user_id': '178245', 'level_id': 32...
803	[{'meta': {'user_id': '178245', 'level_id': 42...
Reply
#10
That is not a json file. game-data is not valid json, and json syntax uses double quotes instead of single quotes. The file also looks like it is several individual json strings, not one data structure. Where did you get the file?

I'm going to guess that the file looks like this:
Output:
game-data [{'meta': {'user_id': '178054', 'level_id': 1}}] [{'meta': {'user_id': '178054', 'level_id': 1}}] [{'meta': {'user_id': '178054', 'level_id': 2}}]
Using that format, I wrote a program that skips the first line. For each following line it replaces single quotes with double quotes and does a json.loads() to get a python list of dictionaries. Then, thinking there is only one dictionary in the list, it gets the first dictionary and gets the values associated with the "meta" key. That information is collected into a list. Finally the json information is normalized and loaded into a dataframe.
import json
import pandas as pd

game_data = []
with open('data.txt', 'r') as file:
    next(file)
    for line in file:
        line = line.replace("'", '"')
        game_data.append(json.loads(line)[0]['meta'])

df = pd.json_normalize(game_data)
print(df)
Output:
user_id level_id 0 178054 1 1 178054 1 2 178054 2
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  encrypt data in json file help jacksfrustration 1 52 5 hours ago
Last Post: deanhystad
  Converting column of values into muliple columns of counts highland44 0 205 Feb-01-2024, 12:48 AM
Last Post: highland44
  Converting .txt to .csv file SunWers 21 11,718 Jan-20-2024, 10:03 AM
Last Post: Larz60+
  Create Choices from .ods file columns cspower 3 519 Dec-28-2023, 09:59 PM
Last Post: deanhystad
  parse json field from csv file lebossejames 4 668 Nov-14-2023, 11:34 PM
Last Post: snippsat
  Create csv file with 4 columns for process mining thomaskissas33 3 693 Nov-06-2023, 09:36 PM
Last Post: deanhystad
  Python Script to convert Json to CSV file chvsnarayana 8 2,344 Apr-26-2023, 10:31 PM
Last Post: DeaD_EyE
  Loop through json file and reset values [SOLVED] AlphaInc 2 1,962 Apr-06-2023, 11:15 AM
Last Post: AlphaInc
  Reading Specific Rows In a CSV File finndude 3 940 Dec-13-2022, 03:19 PM
Last Post: finndude
  How to properly format rows and columns in excel data from parsed .txt blocks jh67 7 1,802 Dec-12-2022, 08:22 PM
Last Post: jh67

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020