I have flat csv data loaded into Data frame and trying to build nested json. I was able to build a nested json with orders as list but not able to add few columns like (geo, attributes) as dict within dict (json) and need some pointers on it.
Sample Data -
clientid requestid geo currency date orderid amount quantity attribute1 attribute2
X1 req1 AMER USD 16427166560 1111 600 1 Product A SKU-a
X1 req1 AMER USD 16427166500 1112 1200 1 Product B SKU-b
X1 req1 AMER CAD 16427167500 1113 1500 1 Product A SKU-a
X2 req2 EMEA EUR 16427163500 1114 1000 1 Product A SKU-a
Json O/P expected format -
Python Data Frame
Sample Data -
clientid requestid geo currency date orderid amount quantity attribute1 attribute2
X1 req1 AMER USD 16427166560 1111 600 1 Product A SKU-a
X1 req1 AMER USD 16427166500 1112 1200 1 Product B SKU-b
X1 req1 AMER CAD 16427167500 1113 1500 1 Product A SKU-a
X2 req2 EMEA EUR 16427163500 1114 1000 1 Product A SKU-a
Json O/P expected format -
Output:{
"request": {
"clientid": "X1",
"requestid": "req1",
"businessunit": {
"geo": "AMER"
}
"currency": "USD",
"orders": [{
"date": 16427166560,
"order_id": "1111",
"amount": "600",
"attributes": {
"attribute1": "Product A",
"attribute2": "SKU-a"
}
},
{
"date": 16427166500,
"order_id": "1112",
"amount": "1200",
"attributes": {
"attribute1": "Product B",
"attribute2": "SKU-b"
}
}
]
}
}
Steps to reproduce Python Data Frame
df = pd.DataFrame([['X1', 'req1', 'AMER', 'USD', 16427166560, '1111', 600, '1', 'Product A', 'SKU-a'], ['X1', 'req1', 'AMER', 'USD', 16427166500, '1112', 1200, '1', 'Product B', 'SKU-b'], ['X1', 'req1', 'AMER', 'CAD', 16427167500, '1113', 1500, '1', 'Product A', 'SKU-a'], ['X2', 'req2', 'EMEA', 'EUR', 16427163500, '1114', 1000, '1', 'Product A', 'SKU-a'] ], columns = ['clientid', 'requestid', 'geo', 'currency', 'date', 'order_id', 'amount', 'quantity', 'attribute1', 'attribute2'])Initial draft which builds nested json -
j = (df.groupby(['clientid', 'requestid', 'geo', 'currency']) .apply(lambda x: x[['date', 'order_id', 'amount']].to_dict('records')) .reset_index() .rename(columns={0: 'orders'}) .to_json(orient='records'))Good to have - If we can get separate json file as O/P for each geo/currency in single shot I would appreciate pointers on it.