Posts: 5
Threads: 2
Joined: May 2018
Aug-12-2022, 05:09 AM
(This post was last modified: Aug-12-2022, 05:09 AM by vijays3.)
Hi All
I need your guidance :
I have three Jason files which in my local drive which need to be loaded in one data frame so that I can do the further exploratory data analysis. I have written a code but this only loads last file in the data frame . Kindly please find the below code for reference.
I have also attached the Json files and notebook code screen shot.
import json
import pandas as pd
import os
path_to_json ='C://JsonFiles//'
def file_path(path_to_json):
json_files = [path_to_json + pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]
return json_files
for jf in file_path(path_to_json):
with open(jf,'r') as f:
data = json.loads(f.read())
# Flatten data
df_nested_list = pd.json_normalize( data, record_path =['attributes'], meta=['ORDERNUMBER', 'PRODUCTCODE'])
df_nested_list
Attached Files
Thumbnail(s)
JsonFiles.zip (Size: 38.4 KB / Downloads: 204)
Posts: 12,025
Threads: 484
Joined: Sep 2016
Try:
import pandas as pd
import os
from pathlib import Path
jpath = Path('C://JsonFiles//')
filelist = [filename for filename in jpath.iterdir()
if filename.is_file() and filename.suffix == '.json']
df = pd.DataFrame([pd.read_json(file, typ="series") for file in filelist])
print(df)
Posts: 5
Threads: 2
Joined: May 2018
Aug-12-2022, 09:28 AM
(This post was last modified: Aug-12-2022, 09:28 AM by vijays3.)
(Aug-12-2022, 09:11 AM)Larz60+ Wrote: Try:
import pandas as pd
import os
from pathlib import Path
jpath = Path('C://JsonFiles//')
filelist = [filename for filename in jpath.iterdir()
if filename.is_file() and filename.suffix == '.json']
df = pd.DataFrame([pd.read_json(file, typ="series") for file in filelist])
print(df)
Thanks Larz, but it does not work what I am expecting. If you look at my json files and review my code, I am trying to load those into one single dataframe with correct column names (I have attached one more screen with the data frame columns). json_normalize function and meta columns are used to extract the data hierarchical data from the Json.
My code works fine but it does not load all the given file's data in DF , only loads last one. Kindly please help to twick my code so that all files loaded in one DF.
Attached Files
Thumbnail(s)
Posts: 7,313
Threads: 123
Joined: Sep 2016
Try.
import json
import pandas as pd
import os
path_to_json ='C://JsonFiles//'
def file_path(path_to_json):
json_files = [path_to_json + pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]
return json_files
df = pd.DataFrame()
for jf in file_path(path_to_json):
with open(jf,'r') as f:
data = json.loads(f.read())
# Flatten data
df_nested_list = pd.json_normalize( data, record_path =['attributes'], meta=['ORDERNUMBER', 'PRODUCTCODE'])
df = df.append(df_nested_list)
Posts: 5
Threads: 2
Joined: May 2018
(Aug-12-2022, 01:29 PM)snippsat Wrote: Try.
import json
import pandas as pd
import os
path_to_json ='C://JsonFiles//'
def file_path(path_to_json):
json_files = [path_to_json + pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]
return json_files
df = pd.DataFrame()
for jf in file_path(path_to_json):
with open(jf,'r') as f:
data = json.loads(f.read())
# Flatten data
df_nested_list = pd.json_normalize( data, record_path =['attributes'], meta=['ORDERNUMBER', 'PRODUCTCODE'])
df = df.append(df_nested_list)
Hi snippsat
It does not load the desired data. I have attached the screen shot what I am looking for and the result from new code
Attached Files
Thumbnail(s)
Posts: 7,313
Threads: 123
Joined: Sep 2016
Test of code using your files.
import json
import os
import pandas as pd
pd.set_option('display.max_column', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.max_seq_items', None)
pd.set_option('display.max_colwidth', 500)
pd.set_option('expand_frame_repr', False)
path_to_json = r'G:\div_code\JsonFiles\\'
def file_path(path_to_json):
json_files = [path_to_json + pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]
return json_files
df = pd.DataFrame()
for jf in file_path(path_to_json):
with open(jf,'r') as f:
data = json.loads(f.read())
# Flatten data
df_nested_list = pd.json_normalize( data, record_path =['attributes'], meta=['ORDERNUMBER', 'PRODUCTCODE'])
df = df.append(df_nested_list) Look at 20 first and 20 last entries,look ok to me.
>>> df.head(20)
QUANTITYORDERED PRICEEACH SALES ORDERDATE STATUS PRODUCTLINE MSRP ORDERNUMBER PRODUCTCODE
0 50 67.80 3390.00 1/6/2003 0:00 Shipped Vintage Cars 60 10100 S18_2248
1 22 86.51 1903.22 1/6/2003 0:00 Shipped Vintage Cars 92 10100 S18_4409
2 49 34.47 1689.03 1/6/2003 0:00 Shipped Vintage Cars 41 10100 S24_3969
3 45 31.20 1404.00 1/9/2003 0:00 Shipped Vintage Cars 33 10101 S24_1937
4 46 53.76 2472.96 1/9/2003 0:00 Shipped Vintage Cars 44 10101 S24_2022
5 41 50.14 2055.74 1/10/2003 0:00 Shipped Vintage Cars 53 10102 S18_1367
6 22 54.09 1189.98 1/29/2003 0:00 Shipped Trucks and Buses 60 10103 S18_2432
7 27 83.07 2242.89 1/29/2003 0:00 Shipped Vintage Cars 101 10103 S18_2949
8 35 57.46 2011.10 1/29/2003 0:00 Shipped Vintage Cars 62 10103 S18_2957
9 41 47.29 1938.89 1/29/2003 0:00 Shipped Vintage Cars 50 10103 S18_4668
10 45 75.63 3403.35 1/29/2003 0:00 Shipped Trucks and Buses 64 10103 S32_3522
11 35 55.49 1942.15 1/31/2003 0:00 Shipped Classic Cars 57 10104 S24_1444
12 44 39.60 1742.40 1/31/2003 0:00 Shipped Classic Cars 35 10104 S24_2840
13 35 47.62 1666.70 1/31/2003 0:00 Shipped Trucks and Buses 54 10104 S32_2509
14 49 65.87 3227.63 1/31/2003 0:00 Shipped Trains 62 10104 S32_3207
15 32 53.31 1705.92 1/31/2003 0:00 Shipped Trains 58 10104 S50_1514
16 41 82.50 3382.50 2/11/2003 0:00 Shipped Vintage Cars 87 10105 S18_4522
17 44 72.58 3193.52 2/11/2003 0:00 Shipped Vintage Cars 88 10105 S24_3151
18 50 79.67 3983.50 2/11/2003 0:00 Shipped Vintage Cars 83 10105 S24_3816
19 41 70.67 2897.47 2/11/2003 0:00 Shipped Ships 66 10105 S700_1138
>>>
>>>
>>> df.tail(20)
QUANTITYORDERED PRICEEACH SALES ORDERDATE STATUS PRODUCTLINE MSRP ORDERNUMBER PRODUCTCODE
244 55 96.30 5296.50 5/29/2005 0:00 In Process Classic Cars 117 10420 S24_2887
245 35 96.74 3385.90 5/29/2005 0:00 In Process Classic Cars 85 10420 S24_3191
246 15 43.49 652.35 5/29/2005 0:00 In Process Vintage Cars 41 10420 S24_3969
247 40 45.70 1828.00 5/29/2005 0:00 In Process Vintage Cars 44 10421 S24_2022
248 51 95.55 4873.05 5/30/2005 0:00 In Process Vintage Cars 102 10422 S18_1342
249 25 51.75 1293.75 5/30/2005 0:00 In Process Vintage Cars 53 10422 S18_1367
250 10 88.14 881.40 5/30/2005 0:00 In Process Vintage Cars 101 10423 S18_2949
251 31 53.72 1665.32 5/30/2005 0:00 In Process Vintage Cars 62 10423 S18_2957
252 21 84.82 1781.22 5/30/2005 0:00 In Process Vintage Cars 104 10423 S18_3136
253 21 89.29 1875.09 5/30/2005 0:00 In Process Vintage Cars 99 10423 S18_3320
254 28 78.89 2208.92 5/30/2005 0:00 In Process Vintage Cars 97 10423 S24_4258
255 26 59.87 1556.62 5/31/2005 0:00 In Process Vintage Cars 50 10424 S18_4668
256 44 61.41 2702.04 5/31/2005 0:00 In Process Trucks and Buses 64 10424 S32_3522
257 46 80.92 3722.32 5/31/2005 0:00 In Process Classic Cars 101 10424 S700_2824
258 38 99.41 3777.58 5/31/2005 0:00 In Process Trucks and Buses 122 10425 S18_2319
259 19 49.22 935.18 5/31/2005 0:00 In Process Trucks and Buses 60 10425 S18_2432
260 55 46.82 2575.10 5/31/2005 0:00 In Process Classic Cars 57 10425 S24_1444
261 31 33.24 1030.44 5/31/2005 0:00 In Process Classic Cars 35 10425 S24_2840
262 41 86.68 3553.88 5/31/2005 0:00 In Process Trucks and Buses 96 10425 S32_1268
263 11 43.83 482.13 5/31/2005 0:00 In Process Trucks and Buses 54 10425 S32_2509
Posts: 5
Threads: 2
Joined: May 2018
(Aug-12-2022, 04:15 PM)snippsat Wrote: Test of code using your files.
import json
import os
import pandas as pd
pd.set_option('display.max_column', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.max_seq_items', None)
pd.set_option('display.max_colwidth', 500)
pd.set_option('expand_frame_repr', False)
path_to_json = r'G:\div_code\JsonFiles\\'
def file_path(path_to_json):
json_files = [path_to_json + pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]
return json_files
df = pd.DataFrame()
for jf in file_path(path_to_json):
with open(jf,'r') as f:
data = json.loads(f.read())
# Flatten data
df_nested_list = pd.json_normalize( data, record_path =['attributes'], meta=['ORDERNUMBER', 'PRODUCTCODE'])
df = df.append(df_nested_list) Look at 20 first and 20 last entries,look ok to me.
>>> df.head(20)
QUANTITYORDERED PRICEEACH SALES ORDERDATE STATUS PRODUCTLINE MSRP ORDERNUMBER PRODUCTCODE
0 50 67.80 3390.00 1/6/2003 0:00 Shipped Vintage Cars 60 10100 S18_2248
1 22 86.51 1903.22 1/6/2003 0:00 Shipped Vintage Cars 92 10100 S18_4409
2 49 34.47 1689.03 1/6/2003 0:00 Shipped Vintage Cars 41 10100 S24_3969
3 45 31.20 1404.00 1/9/2003 0:00 Shipped Vintage Cars 33 10101 S24_1937
4 46 53.76 2472.96 1/9/2003 0:00 Shipped Vintage Cars 44 10101 S24_2022
5 41 50.14 2055.74 1/10/2003 0:00 Shipped Vintage Cars 53 10102 S18_1367
6 22 54.09 1189.98 1/29/2003 0:00 Shipped Trucks and Buses 60 10103 S18_2432
7 27 83.07 2242.89 1/29/2003 0:00 Shipped Vintage Cars 101 10103 S18_2949
8 35 57.46 2011.10 1/29/2003 0:00 Shipped Vintage Cars 62 10103 S18_2957
9 41 47.29 1938.89 1/29/2003 0:00 Shipped Vintage Cars 50 10103 S18_4668
10 45 75.63 3403.35 1/29/2003 0:00 Shipped Trucks and Buses 64 10103 S32_3522
11 35 55.49 1942.15 1/31/2003 0:00 Shipped Classic Cars 57 10104 S24_1444
12 44 39.60 1742.40 1/31/2003 0:00 Shipped Classic Cars 35 10104 S24_2840
13 35 47.62 1666.70 1/31/2003 0:00 Shipped Trucks and Buses 54 10104 S32_2509
14 49 65.87 3227.63 1/31/2003 0:00 Shipped Trains 62 10104 S32_3207
15 32 53.31 1705.92 1/31/2003 0:00 Shipped Trains 58 10104 S50_1514
16 41 82.50 3382.50 2/11/2003 0:00 Shipped Vintage Cars 87 10105 S18_4522
17 44 72.58 3193.52 2/11/2003 0:00 Shipped Vintage Cars 88 10105 S24_3151
18 50 79.67 3983.50 2/11/2003 0:00 Shipped Vintage Cars 83 10105 S24_3816
19 41 70.67 2897.47 2/11/2003 0:00 Shipped Ships 66 10105 S700_1138
>>>
>>>
>>> df.tail(20)
QUANTITYORDERED PRICEEACH SALES ORDERDATE STATUS PRODUCTLINE MSRP ORDERNUMBER PRODUCTCODE
244 55 96.30 5296.50 5/29/2005 0:00 In Process Classic Cars 117 10420 S24_2887
245 35 96.74 3385.90 5/29/2005 0:00 In Process Classic Cars 85 10420 S24_3191
246 15 43.49 652.35 5/29/2005 0:00 In Process Vintage Cars 41 10420 S24_3969
247 40 45.70 1828.00 5/29/2005 0:00 In Process Vintage Cars 44 10421 S24_2022
248 51 95.55 4873.05 5/30/2005 0:00 In Process Vintage Cars 102 10422 S18_1342
249 25 51.75 1293.75 5/30/2005 0:00 In Process Vintage Cars 53 10422 S18_1367
250 10 88.14 881.40 5/30/2005 0:00 In Process Vintage Cars 101 10423 S18_2949
251 31 53.72 1665.32 5/30/2005 0:00 In Process Vintage Cars 62 10423 S18_2957
252 21 84.82 1781.22 5/30/2005 0:00 In Process Vintage Cars 104 10423 S18_3136
253 21 89.29 1875.09 5/30/2005 0:00 In Process Vintage Cars 99 10423 S18_3320
254 28 78.89 2208.92 5/30/2005 0:00 In Process Vintage Cars 97 10423 S24_4258
255 26 59.87 1556.62 5/31/2005 0:00 In Process Vintage Cars 50 10424 S18_4668
256 44 61.41 2702.04 5/31/2005 0:00 In Process Trucks and Buses 64 10424 S32_3522
257 46 80.92 3722.32 5/31/2005 0:00 In Process Classic Cars 101 10424 S700_2824
258 38 99.41 3777.58 5/31/2005 0:00 In Process Trucks and Buses 122 10425 S18_2319
259 19 49.22 935.18 5/31/2005 0:00 In Process Trucks and Buses 60 10425 S18_2432
260 55 46.82 2575.10 5/31/2005 0:00 In Process Classic Cars 57 10425 S24_1444
261 31 33.24 1030.44 5/31/2005 0:00 In Process Classic Cars 35 10425 S24_2840
262 41 86.68 3553.88 5/31/2005 0:00 In Process Trucks and Buses 96 10425 S32_1268
263 11 43.83 482.13 5/31/2005 0:00 In Process Trucks and Buses 54 10425 S32_2509
Amazing.. Thank you very much snippsat. It has worked as expected. You have saved my day.
|