Python Forum
Object madness - JSON Notation confusion
#1
Good day folks. I am still a novice Python programmer, but I was assigned an urgent work task that Python is the answer to :) So, here I go.

I have the XML output from a commercial software product, which I have converted to JSON via XML2JSON. My goal is to locate and extract some specific strings of data. In most cases I find what I need. In certain cases, however, there are multiple rows of data, apparently in a list of nested objects, and I am struggling to write code that extracts them. My basic issue is syntax: I can't work out how to reference the nested elements.

The data example:

This node is successfully parsed:

"Source": {
    "@ID": "11",
    "@HostID": "0",
    "@Type": "FileSystem",
    "@Path": "D:\\PROD\\Product\\Customer\\FileType\\",
    "@FileMask": "*.*",
    "@DeleteOrig": "1",
    "@NewFilesOnly": "0",
    "@SearchSubdirs": "0",
    "@Unzip": "0",
    "@RetryIfNoFiles": "0",
    "@UseDefRetryCount": "1",
    "@UseDefRetryTimeoutSecs": "1",
    "@UseDefRescanSecs": "1",
    "@UDMxFi": "1",
    "@UDMxBy": "1",
    "@ExFo": "Archive",
    "Criteria": {
        "comp": {
            "@a": "[FileDateStamp]",
            "@test": "DLT",
            "@b": "[DateSubtract([Now],12H)]"
        }
    }
},

It is parsed using this syntax:
taskSourcePattern = row['Source'].get('@FileMask','* Source Pattern Not Found *')
Under certain circumstances, there are multiple nodes in the 'Source' object:

"Source": [
    {
        "@ID": "11",
        "@HostID": "0",
        "@Type": "FileSystem",
        "@Path": "D:\\PRODUCT\\CLIENT\\Outgoing",
        "@FileMask": "*.*",
        "@DeleteOrig": "1",
        "@NewFilesOnly": "0",
        "@SearchSubdirs": "0",
        "@Unzip": "0",
        "@RetryIfNoFiles": "0",
        "@UseDefRetryCount": "1",
        "@UseDefRetryTimeoutSecs": "1",
        "@UseDefRescanSecs": "1",
        "@UDMxFi": "1",
        "@UDMxBy": "1"
    },
    {
        "@ID": "17",
        "@HostID": "0",
        "@Type": "FileSystem",
        "@Path": "D:\\PRODUCT\\CLIENT\\Outgoing2",
        "@FileMask": "*.*",
        "@DeleteOrig": "1",
        "@NewFilesOnly": "0",
        "@SearchSubdirs": "0",
        "@Unzip": "0",
        "@RetryIfNoFiles": "0",
        "@UseDefRetryCount": "1",
        "@UseDefRetryTimeoutSecs": "1",
        "@UseDefRescanSecs": "1",
        "@UDMxFi": "1",
        "@UDMxBy": "1"
    }
],
In this case, the 'Source' value is wrapped in '[' and ']', so it is a list of objects rather than a single object. As I iterate through the tasks, how do I handle the sudden appearance of multiple nested records? If you look at my other forum post for another project I am working on, I am basically having the same issue. It's a learning problem :) Maybe you can teach me to fish.

Here is the full code list in this very early build:

import json
import sys

with open('output.json') as json_file:  
	data = json.load(json_file)	
	
	for row in data['Exported']['Tasks']['Task']:
		taskName = row['@Name']
		taskID = row['@ID']	
		print(taskID,"\t",taskName)

		if row.get('Source'):			
# Note the defensive code - the value of 2 indicates I'm about to die
			if len(row.get('Source')) != 2:	
				taskSourcePattern = row['Source'].get('@FileMask','* Source Pattern Not Found *')						
				if row['Source'].get('@Path'):			
					taskSourceFolder =  row['Source'].get('@Path','* Source Folder Not Found *')		
				if row['Source'].get('@FolderName'):
					taskSourceFolder =  row['Source'].get('@FolderName','* Source Folder Not Found *')							
				print("\tSource Information: ",taskSourceFolder,taskSourcePattern)
		if row.get('For'):					
# Note the defensive code - the value of 2 indicates I'm about to die
			if len(row.get('For')) != 2:
				taskDestFolder =  row['For']['Destination'].get('@Path','* Destination Folder Not Found *')
				taskDestFile =  row['For']['Destination'].get('@FileName','* Destination File Not Found *')
				print("\tDest: ",taskDestFolder,taskDestFile)											
		print()
Here is an example of the error:

Traceback (most recent call last):
  File "test3.py", line 19, in <module>
    taskSourcePattern = row['Source'].get('@FileMask','* Source Pattern Not Found *')
AttributeError: 'list' object has no attribute 'get'
#2
Why are you translating?
Why not just read the native format directly?
Is the posted data in the original JSON format?
#3
This is typically a problem that can be solved by indexing with [0].
JSON often has lists mixed in with its objects.
Here is a quick example with a fix.
my_obj = {
    "name":"John",
    "age":30,
    "cars": [
        { "name":"Ford", "models":[ "Fiesta", "Focus", "Mustang" ] },
        { "name":"BMW", "models":[ "320", "X3", "X5" ] },
        { "name":"Fiat", "models":[ "500", "Panda" ] }
    ]
 }
Test:
>>> my_obj.get('age')
30

>>> my_obj.get('cars')
[{'models': ['Fiesta', 'Focus', 'Mustang'], 'name': 'Ford'},
 {'models': ['320', 'X3', 'X5'], 'name': 'BMW'},
 {'models': ['500', 'Panda'], 'name': 'Fiat'}]

>>> my_obj.get('cars').get('models')
Traceback (most recent call last):
  File "<string>", line 428, in runcode
  File "<interactive input>", line 1, in <module>
AttributeError: 'list' object has no attribute 'get'

>>> # Fix
>>> my_obj.get('cars')[0].get('models')
['Fiesta', 'Focus', 'Mustang']
>>> my_obj.get('cars')[1].get('models')
['320', 'X3', 'X5']
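Indexing with [0] only reaches one entry at a time, so for data like that in post #1 it may be easier to treat 'Source' as a list in every case and loop over it. The following is a minimal sketch of that idea, not the original poster's code: as_list and source_info are hypothetical helper names, and the sample rows are made up to mirror the shape of the posted data.

```python
def as_list(value):
    """Wrap a single object in a list; pass lists through; map None to []."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def source_info(row):
    """Collect a (folder, pattern) pair for every Source entry of one task."""
    pairs = []
    for source in as_list(row.get("Source")):
        # Some entries use '@Path', others '@FolderName' (as in post #1).
        folder = (source.get("@Path")
                  or source.get("@FolderName")
                  or "* Source Folder Not Found *")
        pattern = source.get("@FileMask", "* Source Pattern Not Found *")
        pairs.append((folder, pattern))
    return pairs


# Made-up rows: one dict-valued Source, one list-valued Source, one missing.
rows = [
    {"@Name": "single", "Source": {"@Path": "Outgoing", "@FileMask": "*.*"}},
    {"@Name": "multi", "Source": [{"@Path": "Outgoing2", "@FileMask": "*.txt"},
                                  {"@FolderName": "Inbox"}]},
    {"@Name": "none"},
]

for row in rows:
    print(row["@Name"], source_info(row))
```

The same loop then handles one Source, several, or none, without checking len() first.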
#4
You need to traverse the JSON structure recursively if you don't know the exact layout beforehand.
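As a rough illustration of such a recursive traversal, the sketch below walks nested dicts and lists and yields every value stored under a given key, wherever it sits. find_values is a hypothetical helper name, and the sample data is invented to resemble the posted structure.

```python
def find_values(node, key):
    """Yield every value stored under `key`, at any depth of dicts and lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            # Keep descending: the value itself may hold further matches.
            yield from find_values(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_values(item, key)
    # Strings, numbers and None are leaves: nothing to yield.


# Invented sample: 'Source' is a dict in one task and a list in the other.
sample = {"Task": [
    {"Source": {"@FileMask": "*.*"}},
    {"Source": [{"@FileMask": "*.csv"}, {"@FileMask": "*.zip"}]},
]}

masks = list(find_values(sample, "@FileMask"))
print(masks)
```

This finds the values regardless of nesting, but it loses track of which task each value belongs to, so it suits searching more than structured reporting.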

You may also search the XML directly with the lxml package; something with iterfind will work.
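A sketch of the direct-XML approach follows. It uses the standard library's xml.etree.ElementTree, whose elements also provide iterfind, so nothing needs installing; lxml's etree offers a method of the same name. The XML layout here is an assumption inferred from the posted JSON (the '@' keys are taken to be attributes), not the product's actual export.

```python
import xml.etree.ElementTree as ET

# Assumed layout, reconstructed from the JSON in post #1.
xml_text = """
<Exported>
  <Tasks>
    <Task ID="1" Name="single">
      <Source ID="11" Path="Outgoing" FileMask="*.*"/>
    </Task>
    <Task ID="2" Name="multi">
      <Source ID="11" Path="Outgoing" FileMask="*.csv"/>
      <Source ID="17" Path="Outgoing2"/>
    </Task>
  </Tasks>
</Exported>
"""

root = ET.fromstring(xml_text)  # root is the <Exported> element

results = {}
for task in root.iterfind("Tasks/Task"):
    # iterfind yields zero, one, or many <Source> children alike.
    results[task.get("Name")] = [
        (src.get("Path", "* Source Folder Not Found *"),
         src.get("FileMask", "* Source Pattern Not Found *"))
        for src in task.iterfind("Source")
    ]

for name, sources in results.items():
    print(name, sources)
```

Because XML children are always iterated the same way, the dict-versus-list ambiguity introduced by the JSON conversion does not arise here.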
Test everything in a Python shell (iPython, Azure Notebook, etc.)
  • Someone gave you advice you liked? Test it - maybe the advice was actually bad.
  • Someone gave you advice you think is bad? Test it before arguing - maybe it was good.
  • You posted a claim that something you did not test works? Be prepared to eat your hat.
#5
Which xml2json library do you use? Is there an option to always produce consistent output, i.e. an array for Source even when there is only a single element?
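If the converter turns out to have no such option, a similar effect can be approximated after loading the JSON by normalizing the tree once. The sketch below is an assumption-laden illustration: force_lists is a hypothetical helper, and the choice of keys ('Task' and 'Source') is a guess at which elements can repeat in this export.

```python
def force_lists(node, keys):
    """Recursively wrap the value of every key in `keys` in a list, in place."""
    if isinstance(node, dict):
        for k, v in list(node.items()):
            if k in keys and not isinstance(v, list):
                node[k] = [v]
            force_lists(node[k], keys)
    elif isinstance(node, list):
        for item in node:
            force_lists(item, keys)
    return node


# Invented sample: a single Task holding a single Source, both as plain dicts.
data = {"Exported": {"Tasks": {"Task": {
    "@Name": "only",
    "Source": {"@FileMask": "*.*"},
}}}}

force_lists(data, {"Task", "Source"})

# After normalization the same loops work for one element or many.
for task in data["Exported"]["Tasks"]["Task"]:
    for source in task.get("Source", []):
        print(task["@Name"], source["@FileMask"])
```

Normalizing up front keeps the extraction code free of per-field type checks, at the cost of having to list the repeatable element names.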