Posts: 4
Threads: 2
Joined: Jan 2018
Hi,
I have a json file and I want to convert it to csv. What I found is it depends on json format.
JSON file contains:
{"all-services": [
{
"Message" : "Everything is ok",
"Priority" : 7,
"SystemDUnit" : "Service1",
"Timestamp" : 1586858160
},
{
"Message" : "Something is ok",
"Priority" : 7,
"SystemDUnit" : "Service1",
"Timestamp" : 1586858160
},
{
"Message" : "Something might not be ok",
"Priority" : 6,
"SystemDUnit" : "Service1",
"Timestamp" : 1586858160
},
{
"Message" : "Is everything ok?",
"Priority" : 6,
"SystemDUnit" : "Service1",
"Timestamp" : 1586858160
}
]} How can I convert it to csv format and save it?
Thanks.
Posts: 1,838
Threads: 2
Joined: Apr 2017
What have you tried? If you aren't aware, the standard library contains modules for processing JSON and CSV (unimaginatively, they're called json and csv , respectively). See the standard library docs.
Posts: 8,160
Threads: 160
Joined: Sep 2016
as an alternative to csv module - look at pandas. you will need to normalize part of data you will read with json module
Posts: 4
Threads: 2
Joined: Jan 2018
May-03-2020, 09:09 PM
(This post was last modified: May-03-2020, 09:09 PM by kahlenberg.)
Thanks for answers. I forgot to include my code, what I did.
So here is my code. Of course it is not yet complete.
import csv, json, sys
if sys.argv[1] is not None and sys.argv[2] is not None:
fileInput = sys.argv[1]
fileOutput = sys.argv[2]
headers = []
with open(fileInput, 'r') as json_file:
data = json.load(json_file)
print(data)
outputFile = open(fileOutput, 'w') #load csv file
output = csv.writer(outputFile)
for service_name in data.keys():
print(service_name)
for value in data.values():
for header in value[0].keys():
print(header)
headers.append(header)
print(headers) I can print headers (columns) but I didn't like my code :) I know, there must be a better way.
Posts: 8,160
Threads: 160
Joined: Sep 2016
May-04-2020, 05:22 AM
(This post was last modified: May-04-2020, 05:22 AM by buran.)
sys.argv[1] and/or sys.argv[2] will never be None. In case it is missing, you will get IndexError the moment when you try to access element with non-existing index in sys.argv. You may look at click for creating nice cli interface. Or at least at argparse module from standard library.
import json
import sys
import csv
import pandas as pd
try:
_, json_file, csv_file = sys.argv
except ValueError:
print(f'Incorrect number of CLI arguments.\nUse: yourscript.py json_file csv_file')
else:
with open(json_file) as f:
data = json.load(f)
# using csv
with open(csv_file, 'w') as f:
wrtr = csv.DictWriter(f, fieldnames=data['all-services'][0].keys())
wrtr.writeheader()
wrtr.writerows(data['all-services'])
# alternative - usnig pandas
df = pd.json_normalize(data['all-services'])
df.to_csv(csv_file, index=False)
Posts: 4
Threads: 2
Joined: Jan 2018
Thank you. I didn't test it yet but it should work.
Another question :)
I don't want to include 'all-services' in the code and maybe there will be more arrays than one, so for example:
{
"all-services": [
{..},
{..},
{..}
],
"other-service" : [
{..},
{..},
{..}
]
} How can I iterate all arrays? The name of arrays are not important, because those names will be already in the "SysteDUnit" field.
Second question:
I want to change the order of header. For example, instead "Message","Priority".. like in json, I want to have "Timestamp","SystemDUnit","Priority","Message".
How can I do it?
Posts: 8,160
Threads: 160
Joined: Sep 2016
May-04-2020, 09:41 AM
(This post was last modified: May-04-2020, 09:42 AM by buran.)
try to do it your self. open the file to write, write header then iterate over data.values() - this will yield each service list.
as to the filednames - because the new order is completely reversed - just reverse the list
wrtr = csv.DictWriter(f, fieldnames=list(data.values()[0][0].keys())[::-1]) if you want completely custom fieldnames order - just pass a list/tuple with filednames in order you want.
|