Python Forum
What is the cause of getting XPath in a weird format using Python?

Thanks for reviewing this thread.

I am trying to get the XPaths for two inputs, IRSW2.xsd and W2TestfileS.xml. Both files are attached to this thread.

Here is my code:

from lxml import etree


def parseXML(xmlFile, outputFile):
    """
    Parse an XML file and write one "XPath|text" line per node.
    """
    with open(xmlFile, 'rb') as fobj:
        xml = fobj.read()

    root = etree.fromstring(xml)
    tree = etree.ElementTree(root)

    # The with-block closes the output file even if a write fails
    with open(outputFile, 'w') as f:
        f.write("%s|%s\n" % ("Field", "Value"))
        for e in root.iter():
            f.write("%s|%s\n" % (tree.getpath(e), e.text))


if __name__ == "__main__":
    print('Loading variables...')
    input_file = 'IRSW2.xsd'  # renamed so the built-in input() is not shadowed
    output_file = input_file + '.out'

    parseXML(input_file, output_file)

My expected output is as follows:

/IRSW2
/IRSW2/EmployerName/BusinessNameLine1Txt
/IRSW2/W2StateLocalTaxGrp/W2StateTaxGrp
/IRSW2/WithholdingAmt
/IRSW2/EmployersUseGrp/EmployersUseCd
/IRSW2/OtherDeductionsBenefitsGrp/Amt
/IRSW2/W2StateLocalTaxGrp/W2StateTaxGrp/W2LocalTaxGrp/LocalWagesAndTipsAmt
Input XML message from which I want the desired output:

<?xml version="1.0" encoding="UTF-8"?>
<!--Sample XML file generated by XMLSpy v2020 rel. 2 (x64) (http://www.altova.com)-->
<IRSW2 xmlns="http://www.irs.gov/efile" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" documentId="-" softwareId="00000000" softwareVersionNum="!" documentName="IRSW2" xsi:schemaLocation="http://www.irs.gov/efile IRSW2.xsd">
	<CorrectedW2Ind>X</CorrectedW2Ind>
	<EmployeeSSN>000000000</EmployeeSSN>
	<EmployerEIN>000000000</EmployerEIN>
	<EmployerNameControlTxt>&amp;</EmployerNameControlTxt>
	<AgentForEmployerInd>X</AgentForEmployerInd>
	<EmployerName>
		<BusinessNameLine1Txt>#</BusinessNameLine1Txt>
		<BusinessNameLine2Txt>#</BusinessNameLine2Txt>
	</EmployerName>
	<EmployerUSAddress>
		<AddressLine1Txt>0</AddressLine1Txt>
		<AddressLine2Txt>0</AddressLine2Txt>
		<CityNm>A</CityNm>
		<StateAbbreviationCd>AL</StateAbbreviationCd>
		<ZIPCd>00000</ZIPCd>
	</EmployerUSAddress>
	<ControlNum>!</ControlNum>
	<EmployeeNm>'</EmployeeNm>
	<EmployeeUSAddress>
		<AddressLine1Txt>0</AddressLine1Txt>
		<AddressLine2Txt>0</AddressLine2Txt>
		<CityNm>A</CityNm>
		<StateAbbreviationCd>AK</StateAbbreviationCd>
		<ZIPCd>00000</ZIPCd>
	</EmployeeUSAddress>
	<WagesAmt>0</WagesAmt>
	<WithholdingAmt>0</WithholdingAmt>
	<SocialSecurityWagesAmt>0</SocialSecurityWagesAmt>
	<SocialSecurityTaxAmt>0</SocialSecurityTaxAmt>
	<MedicareWagesAndTipsAmt>0</MedicareWagesAndTipsAmt>
	<MedicareTaxWithheldAmt>0</MedicareTaxWithheldAmt>
	<SocialSecurityTipsAmt>0</SocialSecurityTipsAmt>
	<AllocatedTipsAmt>0</AllocatedTipsAmt>
	<DependentCareBenefitsAmt>0</DependentCareBenefitsAmt>
	<NonqualifiedPlansAmt>0</NonqualifiedPlansAmt>
	<EmployersUseGrp>
		<EmployersUseCd>A</EmployersUseCd>
		<PriorUSERRAContributionYr>00</PriorUSERRAContributionYr>
		<EmployersUseAmt>0</EmployersUseAmt>
	</EmployersUseGrp>
	<StatutoryEmployeeInd>X</StatutoryEmployeeInd>
	<RetirementPlanInd>X</RetirementPlanInd>
	<ThirdPartySickPayInd>X</ThirdPartySickPayInd>
	<OtherDeductionsBenefitsGrp>
		<Desc>!</Desc>
		<Amt>0</Amt>
	</OtherDeductionsBenefitsGrp>
	<W2StateLocalTaxGrp>
		<W2StateTaxGrp>
			<StateAbbreviationCd>AS</StateAbbreviationCd>
			<EmployerStateIdNum>!</EmployerStateIdNum>
			<StateWagesAmt>0</StateWagesAmt>
			<StateIncomeTaxAmt>0</StateIncomeTaxAmt>
			<W2LocalTaxGrp>
				<LocalWagesAndTipsAmt>0</LocalWagesAndTipsAmt>
				<LocalIncomeTaxAmt>0</LocalIncomeTaxAmt>
				<LocalityNm>!</LocalityNm>
			</W2LocalTaxGrp>
		</W2StateTaxGrp>
	</W2StateLocalTaxGrp>
	<StandardOrNonStandardCd>N</StandardOrNonStandardCd>
	<W2SecurityInformation>
		<W2DownloadCd>0</W2DownloadCd>
		<W2DownloadResultCd>0</W2DownloadResultCd>
		<W2DownloadFailedAttemptCnt>0</W2DownloadFailedAttemptCnt>
	</W2SecurityInformation>
</IRSW2>
For W2TestfileS.xml, my parseXML code produces:
Output:
Field|Value
/*|
/*/*[1]|X
/*/*[2]|000000000
/*/*[3]|000000000
/*/*[4]|&
/*/*[5]|X
/*/*[6]|
/*/*[6]/*[1]|#
/*/*[6]/*[2]|#
/*/*[7]|
/*/*[7]/*[1]|0
/*/*[7]/*[2]|0
/*/*[7]/*[3]|A
/*/*[7]/*[4]|AL
/*/*[7]/*[5]|00000
/*/*[8]|!
/*/*[9]|'
/*/*[10]|
/*/*[10]/*[1]|0
/*/*[10]/*[2]|0
/*/*[10]/*[3]|A
/*/*[10]/*[4]|AK
/*/*[10]/*[5]|00000
/*/*[11]|0
/*/*[12]|0
/*/*[13]|0
/*/*[14]|0
/*/*[15]|0
/*/*[16]|0
/*/*[17]|0
/*/*[18]|0
/*/*[19]|0
/*/*[20]|0
/*/*[21]|
/*/*[21]/*[1]|A
/*/*[21]/*[2]|00
/*/*[21]/*[3]|0
/*/*[22]|X
/*/*[23]|X
/*/*[24]|X
/*/*[25]|
/*/*[25]/*[1]|!
/*/*[25]/*[2]|0
/*/*[26]|
/*/*[26]/*|
/*/*[26]/*/*[1]|AS
/*/*[26]/*/*[2]|!
/*/*[26]/*/*[3]|0
/*/*[26]/*/*[4]|0
/*/*[26]/*/*[5]|
/*/*[26]/*/*[5]/*[1]|0
/*/*[26]/*/*[5]/*[2]|0
/*/*[26]/*/*[5]/*[3]|!
/*/*[27]|N
/*/*[28]|
/*/*[28]/*[1]|0
/*/*[28]/*[2]|0
/*/*[28]/*[3]|0
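
A likely cause of the `/*/*[1]` style paths: every element in W2TestfileS.xml sits in the default namespace `http://www.irs.gov/efile`, which has no prefix. XPath 1.0 cannot name such an element without a prefix, so `getpath()` appears to fall back to `*` plus a position. One way around it is to build the path from local names instead of calling `getpath()`. The sketch below is only an illustration under some assumptions: it uses the standard library's `xml.etree.ElementTree` rather than lxml so it runs without extra installs, the `SAMPLE` document is a cut-down stand-in for the real file, and `local_name` and `iter_paths` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Cut-down stand-in for W2TestfileS.xml (assumption: same default namespace)
SAMPLE = """<IRSW2 xmlns="http://www.irs.gov/efile">
  <EmployerName>
    <BusinessNameLine1Txt>#</BusinessNameLine1Txt>
  </EmployerName>
  <WithholdingAmt>0</WithholdingAmt>
</IRSW2>"""


def local_name(tag):
    """Drop the '{namespace}' part that ElementTree puts in front of a tag."""
    return tag.rsplit('}', 1)[-1]


def iter_paths(elem, parent_path=""):
    """Yield (path, text) for elem and every descendant element."""
    if not isinstance(elem.tag, str):
        return  # comments and processing instructions have no string tag
    path = parent_path + "/" + local_name(elem.tag)
    yield path, (elem.text or "").strip()
    for child in elem:
        yield from iter_paths(child, path)


rows = list(iter_paths(ET.fromstring(SAMPLE)))
for path, text in rows:
    print("%s|%s" % (path, text))
# /IRSW2|
# /IRSW2/EmployerName|
# /IRSW2/EmployerName/BusinessNameLine1Txt|#
# /IRSW2/WithholdingAmt|0
```

The paths carry no position index, so repeated siblings with the same name would produce identical paths. The same walk should also work on elements parsed with lxml, since the `isinstance(elem.tag, str)` check skips comment nodes there as well.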
For IRSW2.xsd, the same parseXML code produces:

Output:
Field|Value
/xsd:schema|
/xsd:schema/xsd:annotation|
/xsd:schema/xsd:annotation/xsd:documentation|
/xsd:schema/xsd:annotation/xsd:documentation/*[1]|IRS e-file Income Tax Schema - IRS Form W-2 Wage and Tax Statement
/xsd:schema/xsd:annotation/xsd:documentation/*[2]|2020
/xsd:schema/xsd:annotation/xsd:documentation/*[3]|Final Schema Version RL104 Drop 3 1041 Family Form
/xsd:schema/xsd:annotation/xsd:documentation/*[4]|Sept 2 2020
/xsd:schema/xsd:include|None
/xsd:schema/comment()[1]| ===============================================================
/xsd:schema/comment()[2]| ======================= IRS Form W-2 ==========================
/xsd:schema/comment()[3]| ===============================================================
/xsd:schema/xsd:element|
/xsd:schema/xsd:element/xsd:annotation|
/xsd:schema/xsd:element/xsd:annotation/xsd:documentation|IRS Form W-2
/xsd:schema/xsd:element/xsd:complexType|
/xsd:schema/xsd:element/xsd:complexType/xsd:complexContent|
/xsd:schema/xsd:element/xsd:complexType/xsd:complexContent/xsd:extension|
/xsd:schema/xsd:element/xsd:complexType/xsd:complexContent/xsd:extension/xsd:attributeGroup|
/xsd:schema/xsd:element/xsd:complexType/xsd:complexContent/xsd:extension/xsd:attributeGroup/xsd:annotation|
/xsd:schema/xsd:element/xsd:complexType/xsd:complexContent/xsd:extension/xsd:attributeGroup/xsd:annotation/xsd:documentation|Common return document attributes
/xsd:schema/xsd:element/xsd:complexType/xsd:complexContent/xsd:extension/xsd:attribute|
/xsd:schema/xsd:element/xsd:complexType/xsd:complexContent/xsd:extension/xsd:attribute/xsd:annotation|
/xsd:schema/xsd:element/xsd:complexType/xsd:complexContent/xsd:extension/xsd:attribute/xsd:annotation/xsd:documentation|IRS internal use only. To avoid error in the return, do not include the attribute name or value.
/xsd:schema/xsd:complexType[1]|
/xsd:schema/xsd:complexType[1]/xsd:annotation|
/xsd:schema/xsd:complexType[1]/xsd:annotation/xsd:documentation|Content model for Form W-2
/xsd:schema/xsd:complexType[1]/xsd:sequence|
/xsd:schema/xsd:complexType[1]/xsd:sequence/comment()[1]| Corrected W2 Indicator
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[1]|
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[1]/xsd:annotation|
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[1]/xsd:annotation/xsd:documentation|
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[1]/xsd:annotation/xsd:documentation/*[1]|Corrected W2 Indicator
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[1]/xsd:annotation/xsd:documentation/*[2]|0010
/xsd:schema/xsd:complexType[1]/xsd:sequence/comment()[2]| Employee SSN
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[2]|
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[2]/xsd:annotation|
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[2]/xsd:annotation/xsd:documentation|
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[2]/xsd:annotation/xsd:documentation/*[1]|Employee SSN
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[2]/xsd:annotation/xsd:documentation/*[2]|a
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[2]/xsd:annotation/xsd:documentation/*[3]|0035
/xsd:schema/xsd:complexType[1]/xsd:sequence/comment()[3]| Employer EIN
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[3]|
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[3]/xsd:annotation|
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[3]/xsd:annotation/xsd:documentation|
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[3]/xsd:annotation/xsd:documentation/*[1]|Employer EIN
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[3]/xsd:annotation/xsd:documentation/*[2]|b
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[3]/xsd:annotation/xsd:documentation/*[3]|0040
/xsd:schema/xsd:complexType[1]/xsd:sequence/comment()[4]| Employer Name Control
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[4]|
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[4]/xsd:annotation|
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[4]/xsd:annotation/xsd:documentation|
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[4]/xsd:annotation/xsd:documentation/*[1]|c
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[4]/xsd:annotation/xsd:documentation/*[2]|0045
/xsd:schema/xsd:complexType[1]/xsd:sequence/comment()[5]| Agent for Employer Indicator
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[5]|
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[5]/xsd:annotation|
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[5]/xsd:annotation/xsd:documentation|
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[5]/xsd:annotation/xsd:documentation/*[1]|Agent for Employer Indicator
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[5]/xsd:annotation/xsd:documentation/*[2]|c
/xsd:schema/xsd:complexType[1]/xsd:sequence/comment()[6]| Employer Name
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[6]|
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[6]/xsd:annotation|
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[6]/xsd:annotation/xsd:documentation|
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[6]/xsd:annotation/xsd:documentation/*[1]|Employer name
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[6]/xsd:annotation/xsd:documentation/*[2]|c
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:element[6]/xsd:annotation/xsd:documentation/*[3]|0050 0055
/xsd:schema/xsd:complexType[1]/xsd:sequence/comment()[7]| Employer Address
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:choice[1]|
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:choice[1]/comment()[1]| Employer US Address
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:choice[1]/xsd:element[1]|
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:choice[1]/xsd:element[1]/xsd:annotation|
/xsd:schema/xsd:complexType[1]/xsd:sequence/xsd:choice[1]/xsd:element[1]/xsd:annotation/xsd:documentation|
...... .......
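
The XSD output looks different for a separate reason: `getpath()` describes the nodes of the schema file itself (`xsd:element`, `xsd:complexType`, ...), while paths such as `/IRSW2/EmployerName/BusinessNameLine1Txt` describe an instance document. To get instance-style paths from a schema, the `name` attributes of the `xsd:element` declarations have to be chained together and their types resolved. The following is a rough sketch of that idea, not a full XSD processor. Assumptions: it uses the standard library's `xml.etree.ElementTree`, the tiny `SCHEMA` and its type names (`IRSW2Type`, `BusinessNameType`) are made up for illustration, and `element_paths` is a hypothetical helper. It ignores `xsd:include`, `ref=` declarations and groups, so types defined in other files would not be followed.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

# Made-up miniature schema; the real IRSW2.xsd is much larger
SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="IRSW2">
    <xsd:complexType>
      <xsd:complexContent>
        <xsd:extension base="IRSW2Type"/>
      </xsd:complexContent>
    </xsd:complexType>
  </xsd:element>
  <xsd:complexType name="IRSW2Type">
    <xsd:sequence>
      <xsd:element name="EmployerName" type="BusinessNameType"/>
      <xsd:element name="WithholdingAmt" type="xsd:decimal"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="BusinessNameType">
    <xsd:sequence>
      <xsd:element name="BusinessNameLine1Txt" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>"""


def element_paths(schema_root):
    """Yield instance-style paths for each top-level element declaration."""
    # Named complex types declared directly under xsd:schema in this file
    types = {t.get("name"): t for t in schema_root.findall(XSD + "complexType")}

    def child_declarations(node):
        """Yield the xsd:element declarations that node's content model holds."""
        for child in node:
            if child.tag == XSD + "element":
                yield child
                continue  # its own children are handled when it is walked
            if child.tag == XSD + "extension":
                base = (child.get("base") or "").split(":")[-1]
                if base in types:  # inherit the base type's elements first
                    yield from child_declarations(types[base])
            yield from child_declarations(child)

    def walk(decl, parent_path, open_types):
        name = decl.get("name")
        if name is None:  # e.g. <xsd:element ref="..."/>, not handled here
            return
        path = parent_path + "/" + name
        yield path
        type_name = (decl.get("type") or "").split(":")[-1]
        if type_name:
            if type_name not in types or type_name in open_types:
                return  # built-in or unknown type, or a recursive type
            body = types[type_name]
            open_types = open_types | {type_name}
        else:
            body = decl  # anonymous type declared inline
        for child_decl in child_declarations(body):
            yield from walk(child_decl, path, open_types)

    for top in schema_root.findall(XSD + "element"):
        yield from walk(top, "", frozenset())


paths = list(element_paths(ET.fromstring(SCHEMA)))
print("\n".join(paths))
```

If an instance file such as W2TestfileS.xml is available, walking that file is usually the simpler route, since the schema approach has to reimplement type resolution.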
*** The forum does not allow me to upload an .xsd file to this thread. Please let me know how I can upload it.

How do I fix my parseXML code to get the desired XPaths shown in my expected output?

Thanks for your guidance.
.xml   W2TestfileS.xml (Size: 2.73 KB / Downloads: 360)
.zip   IRSW2.zip (Size: 2.71 KB / Downloads: 302)
Messages In This Thread
What is the cause of getting XPath in a weird format using Python? - by MDRI - May-20-2021, 03:04 AM
