Python Forum
parse json field from csv file - Printable Version

+- Python Forum (https://python-forum.io)
+-- Forum: Python Coding (https://python-forum.io/forum-7.html)
+--- Forum: General Coding Help (https://python-forum.io/forum-8.html)
+--- Thread: parse json field from csv file (/thread-41120.html)



parse json field from csv file - lebossejames - Nov-14-2023

Hello,

I want to get from csv file a column contain string json, so i use theses command:
fichier_complet=pd.read_csv("outputfile.csv", sep = ',',dtype='unicode',usecols = ['body'],keep_default_na=False)

print(fichier_complet.values[0])
json_string = json.dumps(fichier_complet.values[0])


The print display string with [' and '] at the beginning and end of value of cell csv file.

I want to get a json object which i can get a value from a different key.

How can i do that please ?
Thank you.


RE: parse json field from csv file - buran - Nov-14-2023

Please, provide sample of the input file
Also, don't post images of code/data/errors, etc. Post as text with proper BBCode tags


RE: parse json field from csv file - lebossejames - Nov-14-2023

(Nov-14-2023, 11:16 AM)buran Wrote: Please, provide sample of the input file
Also, don't post images of code/data/errors, etc. Post as text with proper BBCode tags

Yes this input file

I want get "idAffaire" value then "code" value from prestation object


RE: parse json field from csv file - Pedroski55 - Nov-14-2023

I don't know how to use pandas very well.

Hard to know what starts where and what ends where in the json.

I thinks this gets what you want.

Look carefully at the output of

for item in json_data.items():
    print(json.dumps(item, indent = 4, sort_keys=True))
Sub keys are further right than leading keys. Look for lists as well!

import json
import csv

myfile = '/home/pedro/myPython/json/csv/outputfile.csv'
with open(myfile) as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    data = []
    for row in csv_reader:
        data.append(row)
for d in data:
    print(d, len(d)

# get the last element of data[1] this is the json
json_data = json.loads(data[1][16])

for key in json_data.keys():
    print(json.dumps(key, indent = 4, sort_keys=True))
for item in json_data.items():
    print(json.dumps(item, indent = 4, sort_keys=True))

print(json_data["hits"]['hits'][0]['_source']['idAffaire'])
print(json_data["hits"]['hits'][0]['_source']['demande']['prestation']['code'])



RE: parse json field from csv file - snippsat - Nov-14-2023

This is a JSON response from an API request.
The request is named "RFE" in the collection "Web_test" and it's a GET request to the URL ......
So should not be saved as .csv but .json then keep the correct data structure.
If you do not have control how to get data from the API request and just have the .csv file then have to parse as shown by Pedroski55.
Or like this this.
import csv
import json

csv_file_path = 'data.csv'
json_output = []
with open(csv_file_path, mode='r', encoding='utf-8') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:
        if row['body']:
            row['body'] = json.loads(row['body'])
        json_output.append(row)
json_data = json.dumps(json_output, indent=2)
print(json_data)

with open('output.json', 'w', encoding='utf-8') as json_file:
    json_file.write(json_data)
Output:
[ { "iteration": "1", "collectionName": "Web_test", "requestName": "RFE", "method": "GET", "url": "https://indexa.com/_search", "status": "OK", "code": "200", "responseTime": "509", "responseSize": "1270", "executed": "", "failed": "", "skipped": "", "totalAssertions": "0", "executedCount": "0", "failedCount": "0", "skippedCount": "0", "body": { "took": 152, "timed_out": false, "_shards": { "total": 3, "successful": 3, "skipped": 0, "failed": 0 }, "hits": { "total": { "value": 1, "relation": "eq" }, "max_score": 15.476334, "hits": [ { "_index": "dossiers-bui1_v7", "_type": "_doc", "_id": "A07", "_score": 15.476334, "_source": { "idAffaire": "A07", "idContrat": "F003", "donneesPoint": { "codePostal": "27700", "commune": { "libelle": "HARQUENCY" }, "departement": "27" }, "statut": "COURS", "demande": { "prestation": { "code": "F8", "valeur": "Rac" }, "dateCreation": "2023-05-14T22:00:00.000+00:00", "client": { "categorieClient": "RES", "personnePhysique": { "nom": "MOUSS" }, "personneMorale": {} }, "dateModification": "2023-09-25T22:00:00.000+00:00", "dateEffet": "2023-05-21T22:00:00.000+00:00", "motifCloture": {}, "initiateur": { "login": "[email protected]", "acteurAppartenance": { "codeAcm": "ACM_7", "libelle": "B" } }, "affaireEtat": { "libelle": "Demande re\u00e7ue", "code": "DMRI" }, "affaireEtatExterne": { "libelle": "Demande transmise", "code": "DMTR" }, "affaireAvecRelance": "false", "interventionEnCours": false }, "segment": "C", "acteurMarche": { "codeAcm": "ACM_", "libelle": "BCM " }, "frnSortant": {}, "applicationSource": "AD", "recevabilite": { "prestations": { "prestation": { "option": {} } } } } } ] } } } ]