Python Forum
Load multiple Jason data in one Data Frame
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Load multiple Jason data in one Data Frame
#1
Hi All

I need your guidance :

I have three Jason files which in my local drive which need to be loaded in one data frame so that I can do the further exploratory data analysis. I have written a code but this only loads last file in the data frame . Kindly please find the below code for reference.

I have also attached the Json files and notebook code screen shot.

import json
import pandas as pd
import os

path_to_json ='C://JsonFiles//'

def file_path(path_to_json):
    json_files = [path_to_json + pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]
    return  json_files

for jf in file_path(path_to_json):
    with open(jf,'r') as f:
        data = json.loads(f.read())
        # Flatten data
        df_nested_list = pd.json_normalize( data, record_path =['attributes'], meta=['ORDERNUMBER', 'PRODUCTCODE'])
        

df_nested_list  

Attached Files

Thumbnail(s)
   

.zip   JsonFiles.zip (Size: 38.4 KB / Downloads: 93)
Reply
#2
Try:
import pandas as pd
import os
from pathlib import Path


jpath = Path('C://JsonFiles//')
filelist = [filename for filename in jpath.iterdir()
    if filename.is_file() and filename.suffix == '.json']

df = pd.DataFrame([pd.read_json(file, typ="series") for file in filelist])
print(df)
Reply
#3
(Aug-12-2022, 09:11 AM)Larz60+ Wrote: Try:
import pandas as pd
import os
from pathlib import Path


jpath = Path('C://JsonFiles//')
filelist = [filename for filename in jpath.iterdir()
    if filename.is_file() and filename.suffix == '.json']

df = pd.DataFrame([pd.read_json(file, typ="series") for file in filelist])
print(df)


Thanks Larz, but it does not work what I am expecting. If you look at my json files and review my code, I am trying to load those into one single dataframe with correct column names (I have attached one more screen with the data frame columns). json_normalize function and meta columns are used to extract the data hierarchical data from the Json.

My code works fine but it does not load all the given file's data in DF , only loads last one. Kindly please help to twick my code so that all files loaded in one DF.

Attached Files

Thumbnail(s)
   
Reply
#4
Try.
import json
import pandas as pd
import os

path_to_json ='C://JsonFiles//'

def file_path(path_to_json):
    json_files = [path_to_json + pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]
    return  json_files

df = pd.DataFrame()
for jf in file_path(path_to_json):
    with open(jf,'r') as f:
        data = json.loads(f.read())
        # Flatten data
        df_nested_list = pd.json_normalize( data, record_path =['attributes'], meta=['ORDERNUMBER', 'PRODUCTCODE'])
        df = df.append(df_nested_list)
Reply
#5
(Aug-12-2022, 01:29 PM)snippsat Wrote: Try.
import json
import pandas as pd
import os

path_to_json ='C://JsonFiles//'

def file_path(path_to_json):
    json_files = [path_to_json + pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]
    return  json_files

df = pd.DataFrame()
for jf in file_path(path_to_json):
    with open(jf,'r') as f:
        data = json.loads(f.read())
        # Flatten data
        df_nested_list = pd.json_normalize( data, record_path =['attributes'], meta=['ORDERNUMBER', 'PRODUCTCODE'])
        df = df.append(df_nested_list)

Hi snippsat

It does not load the desired data. I have attached the screen shot what I am looking for and the result from new code

Attached Files

Thumbnail(s)
       
Reply
#6
Test of code using your files.
import json
import os
import pandas as pd
pd.set_option('display.max_column', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.max_seq_items', None)
pd.set_option('display.max_colwidth', 500)
pd.set_option('expand_frame_repr', False)

path_to_json = r'G:\div_code\JsonFiles\\'
def file_path(path_to_json):
    json_files = [path_to_json + pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]
    return  json_files

df = pd.DataFrame()
for jf in file_path(path_to_json):
    with open(jf,'r') as f:
        data = json.loads(f.read())
        # Flatten data
        df_nested_list = pd.json_normalize( data, record_path =['attributes'], meta=['ORDERNUMBER', 'PRODUCTCODE'])
        df = df.append(df_nested_list)
Look at 20 first and 20 last entries,look ok to me.
>>> df.head(20)
    QUANTITYORDERED  PRICEEACH    SALES       ORDERDATE   STATUS       PRODUCTLINE  MSRP ORDERNUMBER PRODUCTCODE
0                50      67.80  3390.00   1/6/2003 0:00  Shipped      Vintage Cars    60       10100    S18_2248
1                22      86.51  1903.22   1/6/2003 0:00  Shipped      Vintage Cars    92       10100    S18_4409
2                49      34.47  1689.03   1/6/2003 0:00  Shipped      Vintage Cars    41       10100    S24_3969
3                45      31.20  1404.00   1/9/2003 0:00  Shipped      Vintage Cars    33       10101    S24_1937
4                46      53.76  2472.96   1/9/2003 0:00  Shipped      Vintage Cars    44       10101    S24_2022
5                41      50.14  2055.74  1/10/2003 0:00  Shipped      Vintage Cars    53       10102    S18_1367
6                22      54.09  1189.98  1/29/2003 0:00  Shipped  Trucks and Buses    60       10103    S18_2432
7                27      83.07  2242.89  1/29/2003 0:00  Shipped      Vintage Cars   101       10103    S18_2949
8                35      57.46  2011.10  1/29/2003 0:00  Shipped      Vintage Cars    62       10103    S18_2957
9                41      47.29  1938.89  1/29/2003 0:00  Shipped      Vintage Cars    50       10103    S18_4668
10               45      75.63  3403.35  1/29/2003 0:00  Shipped  Trucks and Buses    64       10103    S32_3522
11               35      55.49  1942.15  1/31/2003 0:00  Shipped      Classic Cars    57       10104    S24_1444
12               44      39.60  1742.40  1/31/2003 0:00  Shipped      Classic Cars    35       10104    S24_2840
13               35      47.62  1666.70  1/31/2003 0:00  Shipped  Trucks and Buses    54       10104    S32_2509
14               49      65.87  3227.63  1/31/2003 0:00  Shipped            Trains    62       10104    S32_3207
15               32      53.31  1705.92  1/31/2003 0:00  Shipped            Trains    58       10104    S50_1514
16               41      82.50  3382.50  2/11/2003 0:00  Shipped      Vintage Cars    87       10105    S18_4522
17               44      72.58  3193.52  2/11/2003 0:00  Shipped      Vintage Cars    88       10105    S24_3151
18               50      79.67  3983.50  2/11/2003 0:00  Shipped      Vintage Cars    83       10105    S24_3816
19               41      70.67  2897.47  2/11/2003 0:00  Shipped             Ships    66       10105   S700_1138
>>> 
>>> 
>>> df.tail(20)
     QUANTITYORDERED  PRICEEACH    SALES       ORDERDATE      STATUS       PRODUCTLINE  MSRP ORDERNUMBER PRODUCTCODE
244               55      96.30  5296.50  5/29/2005 0:00  In Process      Classic Cars   117       10420    S24_2887
245               35      96.74  3385.90  5/29/2005 0:00  In Process      Classic Cars    85       10420    S24_3191
246               15      43.49   652.35  5/29/2005 0:00  In Process      Vintage Cars    41       10420    S24_3969
247               40      45.70  1828.00  5/29/2005 0:00  In Process      Vintage Cars    44       10421    S24_2022
248               51      95.55  4873.05  5/30/2005 0:00  In Process      Vintage Cars   102       10422    S18_1342
249               25      51.75  1293.75  5/30/2005 0:00  In Process      Vintage Cars    53       10422    S18_1367
250               10      88.14   881.40  5/30/2005 0:00  In Process      Vintage Cars   101       10423    S18_2949
251               31      53.72  1665.32  5/30/2005 0:00  In Process      Vintage Cars    62       10423    S18_2957
252               21      84.82  1781.22  5/30/2005 0:00  In Process      Vintage Cars   104       10423    S18_3136
253               21      89.29  1875.09  5/30/2005 0:00  In Process      Vintage Cars    99       10423    S18_3320
254               28      78.89  2208.92  5/30/2005 0:00  In Process      Vintage Cars    97       10423    S24_4258
255               26      59.87  1556.62  5/31/2005 0:00  In Process      Vintage Cars    50       10424    S18_4668
256               44      61.41  2702.04  5/31/2005 0:00  In Process  Trucks and Buses    64       10424    S32_3522
257               46      80.92  3722.32  5/31/2005 0:00  In Process      Classic Cars   101       10424   S700_2824
258               38      99.41  3777.58  5/31/2005 0:00  In Process  Trucks and Buses   122       10425    S18_2319
259               19      49.22   935.18  5/31/2005 0:00  In Process  Trucks and Buses    60       10425    S18_2432
260               55      46.82  2575.10  5/31/2005 0:00  In Process      Classic Cars    57       10425    S24_1444
261               31      33.24  1030.44  5/31/2005 0:00  In Process      Classic Cars    35       10425    S24_2840
262               41      86.68  3553.88  5/31/2005 0:00  In Process  Trucks and Buses    96       10425    S32_1268
263               11      43.83   482.13  5/31/2005 0:00  In Process  Trucks and Buses    54       10425    S32_2509
Reply
#7
(Aug-12-2022, 04:15 PM)snippsat Wrote: Test of code using your files.
import json
import os
import pandas as pd
pd.set_option('display.max_column', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.max_seq_items', None)
pd.set_option('display.max_colwidth', 500)
pd.set_option('expand_frame_repr', False)

path_to_json = r'G:\div_code\JsonFiles\\'
def file_path(path_to_json):
    json_files = [path_to_json + pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]
    return  json_files

df = pd.DataFrame()
for jf in file_path(path_to_json):
    with open(jf,'r') as f:
        data = json.loads(f.read())
        # Flatten data
        df_nested_list = pd.json_normalize( data, record_path =['attributes'], meta=['ORDERNUMBER', 'PRODUCTCODE'])
        df = df.append(df_nested_list)
Look at 20 first and 20 last entries,look ok to me.
>>> df.head(20)
    QUANTITYORDERED  PRICEEACH    SALES       ORDERDATE   STATUS       PRODUCTLINE  MSRP ORDERNUMBER PRODUCTCODE
0                50      67.80  3390.00   1/6/2003 0:00  Shipped      Vintage Cars    60       10100    S18_2248
1                22      86.51  1903.22   1/6/2003 0:00  Shipped      Vintage Cars    92       10100    S18_4409
2                49      34.47  1689.03   1/6/2003 0:00  Shipped      Vintage Cars    41       10100    S24_3969
3                45      31.20  1404.00   1/9/2003 0:00  Shipped      Vintage Cars    33       10101    S24_1937
4                46      53.76  2472.96   1/9/2003 0:00  Shipped      Vintage Cars    44       10101    S24_2022
5                41      50.14  2055.74  1/10/2003 0:00  Shipped      Vintage Cars    53       10102    S18_1367
6                22      54.09  1189.98  1/29/2003 0:00  Shipped  Trucks and Buses    60       10103    S18_2432
7                27      83.07  2242.89  1/29/2003 0:00  Shipped      Vintage Cars   101       10103    S18_2949
8                35      57.46  2011.10  1/29/2003 0:00  Shipped      Vintage Cars    62       10103    S18_2957
9                41      47.29  1938.89  1/29/2003 0:00  Shipped      Vintage Cars    50       10103    S18_4668
10               45      75.63  3403.35  1/29/2003 0:00  Shipped  Trucks and Buses    64       10103    S32_3522
11               35      55.49  1942.15  1/31/2003 0:00  Shipped      Classic Cars    57       10104    S24_1444
12               44      39.60  1742.40  1/31/2003 0:00  Shipped      Classic Cars    35       10104    S24_2840
13               35      47.62  1666.70  1/31/2003 0:00  Shipped  Trucks and Buses    54       10104    S32_2509
14               49      65.87  3227.63  1/31/2003 0:00  Shipped            Trains    62       10104    S32_3207
15               32      53.31  1705.92  1/31/2003 0:00  Shipped            Trains    58       10104    S50_1514
16               41      82.50  3382.50  2/11/2003 0:00  Shipped      Vintage Cars    87       10105    S18_4522
17               44      72.58  3193.52  2/11/2003 0:00  Shipped      Vintage Cars    88       10105    S24_3151
18               50      79.67  3983.50  2/11/2003 0:00  Shipped      Vintage Cars    83       10105    S24_3816
19               41      70.67  2897.47  2/11/2003 0:00  Shipped             Ships    66       10105   S700_1138
>>> 
>>> 
>>> df.tail(20)
     QUANTITYORDERED  PRICEEACH    SALES       ORDERDATE      STATUS       PRODUCTLINE  MSRP ORDERNUMBER PRODUCTCODE
244               55      96.30  5296.50  5/29/2005 0:00  In Process      Classic Cars   117       10420    S24_2887
245               35      96.74  3385.90  5/29/2005 0:00  In Process      Classic Cars    85       10420    S24_3191
246               15      43.49   652.35  5/29/2005 0:00  In Process      Vintage Cars    41       10420    S24_3969
247               40      45.70  1828.00  5/29/2005 0:00  In Process      Vintage Cars    44       10421    S24_2022
248               51      95.55  4873.05  5/30/2005 0:00  In Process      Vintage Cars   102       10422    S18_1342
249               25      51.75  1293.75  5/30/2005 0:00  In Process      Vintage Cars    53       10422    S18_1367
250               10      88.14   881.40  5/30/2005 0:00  In Process      Vintage Cars   101       10423    S18_2949
251               31      53.72  1665.32  5/30/2005 0:00  In Process      Vintage Cars    62       10423    S18_2957
252               21      84.82  1781.22  5/30/2005 0:00  In Process      Vintage Cars   104       10423    S18_3136
253               21      89.29  1875.09  5/30/2005 0:00  In Process      Vintage Cars    99       10423    S18_3320
254               28      78.89  2208.92  5/30/2005 0:00  In Process      Vintage Cars    97       10423    S24_4258
255               26      59.87  1556.62  5/31/2005 0:00  In Process      Vintage Cars    50       10424    S18_4668
256               44      61.41  2702.04  5/31/2005 0:00  In Process  Trucks and Buses    64       10424    S32_3522
257               46      80.92  3722.32  5/31/2005 0:00  In Process      Classic Cars   101       10424   S700_2824
258               38      99.41  3777.58  5/31/2005 0:00  In Process  Trucks and Buses   122       10425    S18_2319
259               19      49.22   935.18  5/31/2005 0:00  In Process  Trucks and Buses    60       10425    S18_2432
260               55      46.82  2575.10  5/31/2005 0:00  In Process      Classic Cars    57       10425    S24_1444
261               31      33.24  1030.44  5/31/2005 0:00  In Process      Classic Cars    35       10425    S24_2840
262               41      86.68  3553.88  5/31/2005 0:00  In Process  Trucks and Buses    96       10425    S32_1268
263               11      43.83   482.13  5/31/2005 0:00  In Process  Trucks and Buses    54       10425    S32_2509


Amazing.. Thank you very much snippsat. It has worked as expected. You have saved my day.
snippsat likes this post
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Help with to check an Input list data with a data read from an external source sacharyya 3 317 Mar-09-2024, 12:33 PM
Last Post: Pedroski55
  Load data in Oracle muzokh 0 190 Mar-08-2024, 11:19 PM
Last Post: muzokh
  Using pyodbc&pandas to load a Table data to df tester_V 3 746 Sep-09-2023, 08:55 PM
Last Post: tester_V
  how do you style data frame that has empty rows. gsaray101 0 498 Sep-08-2023, 05:20 PM
Last Post: gsaray101
  googletrans library to translate text language for using data frame is not running gcozba2023 0 1,163 Mar-06-2023, 09:50 AM
Last Post: gcozba2023
  Write sql data or CSV Data into parquet file mg24 2 2,356 Sep-26-2022, 08:21 AM
Last Post: ibreeden
  conditionals based on data frame mbrown009 1 873 Aug-12-2022, 08:18 AM
Last Post: Larz60+
  Issue in changing data format (2 bytes) into a 16 bit data. GiggsB 11 2,560 Jul-25-2022, 03:19 PM
Last Post: deanhystad
  Merging two Data Frame on a special case piku9290dgp 0 1,069 Mar-02-2022, 10:43 AM
Last Post: piku9290dgp
  OCR-Python from Multi TIFF to HOCR getting only Data from 1st Page of multiple TIFF JOE 0 2,120 Feb-18-2022, 03:18 PM
Last Post: JOE

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020