Python Forum
How do I get full XPath extract using Python?
#1
How do I get full XPath extract using Python?
=====================================

Thanks for your response to my threads.

I am trying to use the Python code below, but I am getting abbreviated XPaths instead of full XPaths. What changes are required to the code to get the full XPath?

from lxml import etree

def parseXML(xmlFile, outputFile):
    """
    Parse the XML file and write one "path|text" row per element.
    """
    # Read bytes: lxml rejects str input that carries an encoding declaration
    with open(xmlFile, 'rb') as fobj:
        xml = fobj.read()

    root = etree.fromstring(xml)
    tree = etree.ElementTree(root)

    with open(outputFile, 'w') as f:  # closed automatically, even on error
        f.write("%s|%s\n" % ("Field", "Value"))
        for e in root.iter():
            f.write("%s|%s\n" % (tree.getpath(e), e.text))

if __name__ == "__main__":
    print('Loading variables...')
    input = 'inputf.xml'
    output = input + '.csv'

    parseXML(input, output)
I have a large XML file (inputf.xml), which I pass to the code posted above as input = 'inputf.xml'.

Input XML (inputf.xml):
<?xml version="1.0" encoding="UTF-8"?>
<DataFileFor>
  <DataR>
    <Id>5070022019330a0050hq</Id>
    <NUM>30221730001019</NUM>
    <Postmark>2020-01-03T09:25:57.000-05:00</Postmark>
    <TNO>47647</TNO>
    . . . . .
</DataFileFor>

When I grab a node by its XPath using xml_grep, I get the following.

xml_grep DataFileFor/DataR/Ret/W2 inputf.xml

Output:
<?xml version="1.0" ?>
<xml_grep version="0.7" date="Fri Jun 26 13:07:11 2020">
<file filename="inputf.xml">
<W2 Id="W2" dName="W2" sId="00000000" sVersionNum="String">
  <CorrectedW2Ind>X</CorrectedW2Ind>
  <EmployeeSSN>000000000</EmployeeSSN>
  <EmployerEIN>000000000</EmployerEIN>
  <EmployerNameControlTxt>S</EmployerNameControlTxt>
  <EmployerName>
    <BusinessNameLine1Txt>String</BusinessNameLine1Txt>
    <BusinessNameLine2Txt>String</BusinessNameLine2Txt>
  </EmployerName>
  <EmployerUSAddress>
    <AddressLine1Txt>String</AddressLine1Txt>
    <AddressLine2Txt>String</AddressLine2Txt>
    <CityNm>String</CityNm>
    <StateAbbreviationCd>AL</StateAbbreviationCd>
    <ZIPCd>000000000</ZIPCd>
    . . . . .
</W2>
When I run the Python code above, it produces abbreviated XPaths instead of full XPaths. The output paths look like this:

Output:
/DataFileFor/DataR/*[8]/*[2]/*[6]/*[3]/*[10]|X
/DataFileFor/DataR/*[8]/*[2]/*[6]/*[3]/*[11]|00000000
/DataFileFor/DataR/*[8]/*[2]/*[6]/*[3]/*[12]|00000000
/DataFileFor/DataR/*[8]/*[2]/*[6]/*[3]/*[13]|S
/DataFileFor/DataR/*[8]/*[2]/*[6]/*[3]/*[14]|String

What are the changes required to the code to get FULL XPATH?
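
One possible explanation, offered as an assumption because the sample above does not show any xmlns declaration: getpath() writes a step as *[n] when the element belongs to a default (unprefixed) namespace, since plain XPath cannot name such an element without a prefix. A minimal sketch of a workaround is to build the path yourself from local tag names. The helper names local_name and iter_full_paths, the sample document, and the urn:example:placeholder namespace are all made up for illustration. The sketch uses the standard library parser so it runs without extra packages; it only touches .tag, .text and child iteration, which lxml elements also provide.

```python
from collections import Counter
import xml.etree.ElementTree as ET  # lxml.etree elements offer the same .tag/.text/iteration


def local_name(tag):
    """Strip a leading '{namespace}' part from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def iter_full_paths(element, path=None):
    """Yield (path, element) pairs with every step spelled out by local name."""
    if path is None:
        path = "/" + local_name(element.tag)
    yield path, element
    # Comments and processing instructions have a non-string .tag; skip them.
    children = [c for c in element if isinstance(c.tag, str)]
    totals = Counter(c.tag for c in children)
    seen = Counter()
    for child in children:
        seen[child.tag] += 1
        step = local_name(child.tag)
        # Add a [n] index only when same-named siblings make the step ambiguous.
        if totals[child.tag] > 1:
            step += "[%d]" % seen[child.tag]
        yield from iter_full_paths(child, "%s/%s" % (path, step))


# Hypothetical sample: the namespace on <Ret> is an assumption, not taken from inputf.xml.
SAMPLE = (
    '<DataFileFor><DataR><Id>5070022019330a0050hq</Id>'
    '<Ret xmlns="urn:example:placeholder">'
    '<W2 Id="W2"><EmployeeSSN>000000000</EmployeeSSN></W2>'
    '<W2><EmployeeSSN>000000000</EmployeeSSN></W2>'
    '</Ret></DataR></DataFileFor>'
)

root = ET.fromstring(SAMPLE)
for path, elem in iter_full_paths(root):
    print("%s|%s" % (path, (elem.text or "").strip()))
```

The paths produced this way are readable labels rather than expressions you can evaluate directly against a namespaced document. If you need something lxml can evaluate, tree.getelementpath(e) may be worth a look; it returns an ElementPath expression with the namespace written out as {uri}name.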

Also, the attributes of the W2 element (Id="W2" dName="W2" sId="00000000" sVersionNum="String") are not showing up in the output.

What are the changes required to the code, to fix this?
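
The attributes are missing because the loop only writes e.text; attribute values live in e.attrib and have to be written separately. A minimal sketch follows, assuming a row format of path/@name|value, which is my own choice and not something the original code defines. The helper names local_name and iter_rows are hypothetical, and the W2 fragment is cut down from the xml_grep output above. It uses the standard library parser; .attrib behaves the same way on lxml elements. Positional [n] indexes are left out to keep it short.

```python
import xml.etree.ElementTree as ET  # .attrib works the same on lxml.etree elements


def local_name(tag):
    """Strip a leading '{namespace}' part from a tag or attribute name."""
    return tag.rsplit("}", 1)[-1]


def iter_rows(element, parent_path=""):
    """Yield 'path|text' for each element, then 'path/@attr|value' per attribute."""
    path = "%s/%s" % (parent_path, local_name(element.tag))
    yield "%s|%s" % (path, (element.text or "").strip())
    # Attributes are not part of element.text, so emit them explicitly.
    for name, value in element.attrib.items():
        yield "%s/@%s|%s" % (path, local_name(name), value)
    for child in element:
        # Skip comments and processing instructions (non-string .tag).
        if isinstance(child.tag, str):
            yield from iter_rows(child, path)


# Fragment shortened from the xml_grep output in this post.
SAMPLE = (
    '<W2 Id="W2" dName="W2" sId="00000000" sVersionNum="String">'
    '<CorrectedW2Ind>X</CorrectedW2Ind>'
    '</W2>'
)

for row in iter_rows(ET.fromstring(SAMPLE)):
    print(row)
```

In the original parseXML function, the equivalent change would be an inner loop over e.attrib.items() right after the line that writes e.text.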

Thanks for your guidance.
#2
Any thoughts?