Python Forum

How to compare two json and write to third json differences with pandas and numpy - onenessboy - Jul-24-2020

Hi

I am trying to compare two JSON files and then write a third JSON file that lists the column names along with a yes/no flag showing whether the values differ. I am using pandas and NumPy.

Input files: these input files are dynamic. The example below has only two keys, whereas my other files may have any number of keys, so the requirement is to loop over all columns, compare them, and write the result to JSON.

fut.json

[
    {
        "AlarmName": "test",
        "StateValue": "OK"
    }
]

curr.json:

[
    {
        "AlarmName": "test",
        "StateValue": "OK"
    }
]

Below is the code I have tried:

    import json

    import numpy as np
    import pandas as pd

    # Load both input files (read-only access is enough)
    with open(r"c:\csv\fut.json", 'r') as f:
        data_b = json.load(f)
    with open(r"c:\csv\curr.json", 'r') as f:
        data_a = json.load(f)
    df_a = pd.json_normalize(data_a)
    df_b = pd.json_normalize(data_b)

    # Align the frames so both have the same rows and columns
    _, df_a = df_b.align(df_a, fill_value=np.nan)
    _, df_b = df_a.align(df_b, fill_value=np.nan)

    with open(r"c:\csv\report.json", 'w') as _file:
        for col in df_a.columns:
            df_temp = pd.DataFrame()
            df_temp[col + '_curr'] = df_a[col]
            df_temp[col + '_fut'] = df_b[col]
            df_temp[col + '_diff'] = np.where(df_a[col] == df_b[col], 'No', 'Yes')
            #[df_temp.rename(columns={c:'Missing'}, inplace=True) for c in df_temp.columns if df_temp[c].isnull().all()]
            df_temp = df_temp.fillna('Missing')
            with pd.option_context('display.max_colwidth', None):
                _file.write(df_temp.to_json(orient='records'))
 
Expected output:

[
    {
        "AlarmName_curr": "test",
        "AlarmName_fut": "test",
        "AlarmName_diff": "No"
    },
    {
        "StateValue_curr": "OK",
        "StateValue_fut": "OK",
        "StateValue_diff": "No"
    }
]

Actual output: a JSON validator cannot parse it. The problem is shown below: the "][" between the two arrays should be replaced by "," to get valid JSON. I don't know why it is printed like that.

[{"AlarmName_curr":"test","AlarmName_fut":"test","AlarmName_diff":"No"}][{"StateValue_curr":"OK","StateValue_fut":"OK","StateValue_diff":"No"}]

I also tried the following:

_file.write(df_temp.to_json(orient='records',lines=True))

Now I get JSON that again cannot be parsed: the "," is missing, and it does not parse unless I manually add "," between the two dicts and [ ] at the beginning and end.

[{"AlarmName_curr":"test","AlarmName_fut":"test","AlarmName_diff":"No"}{"StateValue_curr":"OK","StateValue_fut":"OK","StateValue_diff":"No"}]

Any help will be highly appreciated. Thanks in advance.
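
For reference, here is a minimal sketch of one way the expected output above could be produced. It is not taken from the thread: the helper name build_report is hypothetical, and the sketch assumes both inputs are lists of flat records like the samples. The idea is to collect the per-column records in one Python list and serialize that list a single time, instead of writing a separate JSON array for every column.

```python
import json

import numpy as np
import pandas as pd


def build_report(data_curr, data_fut):
    """Return one list holding the comparison records for every column.

    build_report is a hypothetical helper name used only for this sketch.
    """
    df_curr = pd.json_normalize(data_curr)
    df_fut = pd.json_normalize(data_fut)
    # Outer-align so both frames share the same rows and columns;
    # keys present in only one input become NaN in the other.
    df_curr, df_fut = df_curr.align(df_fut, join="outer")

    report = []
    for col in df_curr.columns:
        df_temp = pd.DataFrame({
            col + "_curr": df_curr[col],
            col + "_fut": df_fut[col],
        })
        df_temp[col + "_diff"] = np.where(df_curr[col] == df_fut[col], "No", "Yes")
        # Cast to object first so NaN can be replaced by a string label.
        df_temp = df_temp.astype(object).fillna("Missing")
        # Round-trip through to_json to get plain Python values,
        # then append them to the single shared list.
        report.extend(json.loads(df_temp.to_json(orient="records")))
    return report


curr = [{"AlarmName": "test", "StateValue": "OK"}]
fut = [{"AlarmName": "test", "StateValue": "OK"}]
report = build_report(curr, fut)

# Serialize once, so the result is a single JSON array.
print(json.dumps(report, indent=4))
```

To write the result to a file instead of printing it, json.dump(report, f, indent=4) on a file opened for writing should give the same single array.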