Python Forum
Object madness - JSON Notation confusion
#1
Good day, folks. I am still a novice Python programmer, but I have been assigned an urgent work task that Python is the answer to :) So, here I go.

I have the XML output from a commercial software product, which I have converted to JSON with XML2JSON. My goal is to locate and extract some specific strings of data. In most cases I find what I need. In certain cases, however, there are multiple rows of data, apparently in a list of nested objects, and I am struggling to write code that extracts them. My basic issue is syntax: I can't work out how to reference the nested elements.

The data example:

This node is successfully parsed:

"Source": {
    "@ID": "11",
    "@HostID": "0",
    "@Type": "FileSystem",
    "@Path": "D:\\PROD\\Product\\Customer\\FileType\\",
    "@FileMask": "*.*",
    "@DeleteOrig": "1",
    "@NewFilesOnly": "0",
    "@SearchSubdirs": "0",
    "@Unzip": "0",
    "@RetryIfNoFiles": "0",
    "@UseDefRetryCount": "1",
    "@UseDefRetryTimeoutSecs": "1",
    "@UseDefRescanSecs": "1",
    "@UDMxFi": "1",
    "@UDMxBy": "1",
    "@ExFo": "Archive",
    "Criteria": {
        "comp": {
            "@a": "[FileDateStamp]",
            "@test": "DLT",
            "@b": "[DateSubtract([Now],12H)]"
        }
    }
},

It is parsed using this syntax:
taskSourcePattern = row['Source'].get('@FileMask','* Source Pattern Not Found *')

Under certain circumstances, there are multiple nodes in the 'Source' object:

"Source": [
    {
        "@ID": "11",
        "@HostID": "0",
        "@Type": "FileSystem",
        "@Path": "D:\\PRODUCT\\CLIENT\\Outgoing",
        "@FileMask": "*.*",
        "@DeleteOrig": "1",
        "@NewFilesOnly": "0",
        "@SearchSubdirs": "0",
        "@Unzip": "0",
        "@RetryIfNoFiles": "0",
        "@UseDefRetryCount": "1",
        "@UseDefRetryTimeoutSecs": "1",
        "@UseDefRescanSecs": "1",
        "@UDMxFi": "1",
        "@UDMxBy": "1"
    },
    {
        "@ID": "17",
        "@HostID": "0",
        "@Type": "FileSystem",
        "@Path": "D:\\PRODUCT\\CLIENT\\Outgoing2",
        "@FileMask": "*.*",
        "@DeleteOrig": "1",
        "@NewFilesOnly": "0",
        "@SearchSubdirs": "0",
        "@Unzip": "0",
        "@RetryIfNoFiles": "0",
        "@UseDefRetryCount": "1",
        "@UseDefRetryTimeoutSecs": "1",
        "@UseDefRescanSecs": "1",
        "@UDMxFi": "1",
        "@UDMxBy": "1"
    }
],
In this case, the value is wrapped in an extra '[' and ']', so 'Source' is a list of objects rather than a single object. As I iterate through the tasks, how do I handle the sudden inclusion of multiple nested records? If you look at my other forum post, for another project I am working on, I am having basically the same issue. It's a learning problem :) Maybe you can teach me to fish.

Here is the full code listing for this very early build:

import json
import sys

with open('output.json') as json_file:  
	data = json.load(json_file)	
	
	for row in data['Exported']['Tasks']['Task']:
		taskName = row['@Name']
		taskID = row['@ID']	
		print(taskID,"\t",taskName)

		if row.get('Source'):			
# Note the defensive code - the value of 2 indicates I'm about to die
			if len(row.get('Source')) != 2:	
				taskSourcePattern = row['Source'].get('@FileMask','* Source Pattern Not Found *')						
				if row['Source'].get('@Path'):			
					taskSourceFolder =  row['Source'].get('@Path','* Source Folder Not Found *')		
				if row['Source'].get('@FolderName'):
					taskSourceFolder =  row['Source'].get('@FolderName','* Source Folder Not Found *')							
				print("\tSource Information: ",taskSourceFolder,taskSourcePattern)
		if row.get('For'):					
# Note the defensive code - the value of 2 indicates I'm about to die
			if len(row.get('For')) != 2:
				taskDestFolder =  row['For']['Destination'].get('@Path','* Destination Folder Not Found *')
				taskDestFile =  row['For']['Destination'].get('@FileName','* Destination File Not Found *')
				print("\tDest: ",taskDestFolder,taskDestFile)											
		print()
Here is an example of the error:

Traceback (most recent call last):
  File "test3.py", line 19, in <module>
    taskSourcePattern = row['Source'].get('@FileMask','* Source Pattern Not Found *')
AttributeError: 'list' object has no attribute 'get'
#2
Why are you translating?
Why not just read the native JSON format directly?
Is the posted data in the original JSON format?
#3
This is typically a problem that can be solved by indexing into the list, e.g. with [0].
JSON often has lists mixed in with the objects.
Here is a quick example with a fix.
my_obj = {
    "name":"John",
    "age":30,
    "cars": [
        { "name":"Ford", "models":[ "Fiesta", "Focus", "Mustang" ] },
        { "name":"BMW", "models":[ "320", "X3", "X5" ] },
        { "name":"Fiat", "models":[ "500", "Panda" ] }
    ]
 }
Test:
>>> my_obj.get('age')
30

>>> my_obj.get('cars')
[{'models': ['Fiesta', 'Focus', 'Mustang'], 'name': 'Ford'},
 {'models': ['320', 'X3', 'X5'], 'name': 'BMW'},
 {'models': ['500', 'Panda'], 'name': 'Fiat'}]

>>> my_obj.get('cars').get('models')
Traceback (most recent call last):
  File "<string>", line 428, in runcode
  File "<interactive input>", line 1, in <module>
AttributeError: 'list' object has no attribute 'get'

>>> # Fix
>>> my_obj.get('cars')[0].get('models')
['Fiesta', 'Focus', 'Mustang']
>>> my_obj.get('cars')[1].get('models')
['320', 'X3', 'X5']
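
Indexing with [0] picks out one entry. For the data in the first post, where 'Source' is sometimes a single object and sometimes a list of objects, one common alternative is to wrap the single-object case in a list and then loop over it either way. The following is only a minimal sketch of that idea: as_list and source_info are hypothetical helper names, the sample tasks are made up, and preferring '@Path' over '@FolderName' is an assumption.

```python
def as_list(value):
    """Return value as a list: None -> [], a list stays as is, anything else is wrapped."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def source_info(task):
    """Collect a (folder, pattern) pair for every Source entry of one task."""
    results = []
    for source in as_list(task.get('Source')):
        pattern = source.get('@FileMask', '* Source Pattern Not Found *')
        # Assumption: prefer '@Path' and fall back to '@FolderName'.
        folder = (source.get('@Path')
                  or source.get('@FolderName')
                  or '* Source Folder Not Found *')
        results.append((folder, pattern))
    return results


# Made-up sample tasks: one with a single Source object, one with a list.
single = {'Source': {'@Path': 'D:\\PROD\\In', '@FileMask': '*.*'}}
multi = {'Source': [{'@Path': 'D:\\PROD\\Out', '@FileMask': '*.*'},
                    {'@FolderName': 'Outgoing2', '@FileMask': '*.csv'}]}

for task in (single, multi):
    for folder, pattern in source_info(task):
        print("\tSource Information: ", folder, pattern)
```

The same loop body then handles both shapes, so a length check is no longer needed to guess which one is present.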
#4
You need to traverse the JSON structure recursively if you don't know the exact layout beforehand.
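
A recursive traversal could look roughly like this. It is a sketch, not a drop-in solution: find_key is a hypothetical helper name and the sample data is invented to mirror the single-object and list cases from the first post.

```python
def find_key(node, key):
    """Yield every value stored under `key`, at any depth of nested dicts and lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            # Keep descending: the value itself may contain further matches.
            yield from find_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)
    # Strings, numbers, None: nothing to descend into.


# Invented sample: 'Source' is an object in one task and a list in the other.
data = {
    'Task': [
        {'@Name': 'one', 'Source': {'@Path': 'D:\\A'}},
        {'@Name': 'two', 'Source': [{'@Path': 'D:\\B'}, {'@Path': 'D:\\C'}]},
    ]
}

paths = list(find_key(data, '@Path'))
print(paths)
```

Because the function never assumes whether a value is a dict or a list, it does not care which of the two shapes the converter produced.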

You could also search the XML directly with the lxml package; something using iterfind should work.
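
As a rough illustration of the iterfind approach, here is a sketch using the standard library's xml.etree.ElementTree, whose elements also provide an iterfind method, so nothing needs to be installed. The XML below is hypothetical: it is reconstructed from the JSON layout in the first post, and the real export may well differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML, guessed from the JSON shown in the first post.
xml_text = """
<Exported>
  <Tasks>
    <Task ID="1" Name="single">
      <Source ID="11" Path="D:\\PROD\\In" FileMask="*.*"/>
    </Task>
    <Task ID="2" Name="double">
      <Source ID="11" Path="D:\\PROD\\Out" FileMask="*.*"/>
      <Source ID="17" Path="D:\\PROD\\Out2" FileMask="*.csv"/>
    </Task>
  </Tasks>
</Exported>
"""

root = ET.fromstring(xml_text.strip())

rows = []
for task in root.iterfind('./Tasks/Task'):
    # iterfind yields zero, one, or many Source children, with no special cases.
    for source in task.iterfind('Source'):
        rows.append((
            task.get('Name'),
            source.get('Path', '* Source Folder Not Found *'),
            source.get('FileMask', '* Source Pattern Not Found *'),
        ))

for row in rows:
    print(*row, sep='\t')
```

Working on the XML directly sidesteps the object-versus-list ambiguity, because child elements are always iterated the same way regardless of how many there are.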
#5
Which xml2json library do you use? Is there an option to always produce consistent output, i.e. an array for Source even when there is only a single element?
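
Some converters do expose such an option (xmltodict's parse, for example, accepts a force_list argument); whether the XML2JSON tool used in this thread has one is not stated. If it does not, the same consistency can be imposed right after loading the JSON. A minimal sketch, assuming a hypothetical force_list helper and invented sample data:

```python
def force_list(node, keys):
    """Recursively wrap the values of the named keys in lists (modifies node in place)."""
    if isinstance(node, dict):
        for k in list(node):
            if k in keys and not isinstance(node[k], list):
                node[k] = [node[k]]
            force_list(node[k], keys)
    elif isinstance(node, list):
        for item in node:
            force_list(item, keys)
    return node


# Invented sample: both 'Task' and 'Source' came out as single objects.
data = {'Tasks': {'Task': {'@Name': 'one', 'Source': {'@Path': 'D:\\A'}}}}

force_list(data, {'Task', 'Source'})
print(data)

# After normalizing, every Task and every Source can be looped over uniformly.
for task in data['Tasks']['Task']:
    for source in task['Source']:
        print(task['@Name'], source['@Path'])
```

Normalizing once at load time keeps the rest of the script free of type checks, at the cost of having to list the keys that may repeat.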