Python Forum
lxml - etree/lxml need help storing variable for most inserted element
Hello benevolent python vets,

This is a new (and very desperate/frustrated) member requesting your assistance here.

I have had great success with Python and lxml for the past few months but have hit another roadblock with my newest project. My apologies for the long post; I worry about not supplying enough info to get the help I need, so I can stop banging my head against the wall.


Below is a very simplified example of what the xml looks like today:

<Schema xmlns:ofda="http://www.ofdaxml.org" xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.ofdaxml.org/schema" elementFormDefault="qualified" attributeFormDefault="unqualified" version="01.00.00">
  <Envelope SchemaVersion="01.04.00">
    <ExternalReference>
      <Usage>
        <Type>CETSymbolsDir</Type>
        <OtherType>\</OtherType>
      </Usage>
    </ExternalReference>
    <Enterprise>
      <Code>www.Something.com</Code>
      <Name Language="en-US">Some Name</Name>
      <Description Language="en-US">Something</Description>
      <UnitMeasure>in</UnitMeasure>
      <Material>
        <Code>BL</Code>
        <Description Language="en-US">Blue</Description>
        <ExternalReference>
          <FileURI>Blue.jpg</FileURI>
          <Usage>
            <Type>Image</Type>
            <Quality>Medium</Quality>
          </Usage>
        </ExternalReference>
      </Material>
      <Material>
        <Code>WH</Code>
        <Description Language="en-US">White</Description>
        <ExternalReference>
          <FileURI>White.jpg</FileURI>
          <Usage>
            <Type>Image</Type>
            <Quality>Medium</Quality>
          </Usage>
        </ExternalReference>
      </Material>
      <Vendor>
        <Code>www.Something.Com</Code>
      </Vendor>
      <Product>
        <Code>AA</Code>
        <Description Language="en-US">Tray</Description>
        <Price>
            <Value>787</Value>
        </Price>
      </Product>
      <Product>
        <Code>BB</Code>
        <Description Language="en-US">Chair</Description>
        <Price>
            <Value>525</Value>
        </Price>
      </Product>     
    </Enterprise>
  </Envelope>
</Schema>
I have a list of new Material data, but the issue I am having is inserting it just before the Product elements, since Material and Product are both children of the Enterprise element. Although the location wouldn't necessarily affect how the XML is read, I would prefer to keep all the Material data together in case our vendor ever changes their XML structure in the future.

The new material data is stored in a dictionary with almost 200 keys (the original XML already has about 5000 Material tags). Here is a simplified example of what I am hoping to achieve:

<Schema xmlns:ofda="http://www.ofdaxml.org" xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.ofdaxml.org/schema" elementFormDefault="qualified" attributeFormDefault="unqualified" version="01.00.00">
  <Envelope SchemaVersion="01.04.00">
    <ExternalReference>
      <Usage>
        <Type>CETSymbolsDir</Type>
        <OtherType>\</OtherType>
      </Usage>
    </ExternalReference>
    <Enterprise>
      <Code>www.Something.com</Code>
      <Name Language="en-US">Some Name</Name>
      <Description Language="en-US">Something</Description>
      <UnitMeasure>in</UnitMeasure>
      <Material>
        <Code>BL</Code>
        <Description Language="en-US">Blue</Description>
        <ExternalReference>
          <FileURI>BL.jpg</FileURI>
          <Usage>
            <Type>Image</Type>
            <Quality>Medium</Quality>
          </Usage>
        </ExternalReference>
      </Material>
      <Material>
        <Code>WH</Code>
        <Description Language="en-US">White</Description>
        <ExternalReference>
          <FileURI>WH.jpg</FileURI>
          <Usage>
            <Type>Image</Type>
            <Quality>Medium</Quality>
          </Usage>
        </ExternalReference>
      </Material>
      <Material>  <!-- First new material added -->
        <Code>RD</Code>
        <Description Language="en-US">Red</Description>
        <ExternalReference>
          <FileURI>RD.jpg</FileURI>
          <Usage>
            <Type>Image</Type>
            <Quality>Medium</Quality>
          </Usage>
        </ExternalReference>
      </Material>
      <Material>  <!-- Second new material added -->
        <Code>YL</Code>
        <Description Language="en-US">Yellow</Description>
        <ExternalReference>
          <FileURI>YL.jpg</FileURI>
          <Usage>
            <Type>Image</Type>
            <Quality>Medium</Quality>
          </Usage>
        </ExternalReference>
      </Material>
      <Vendor>
        <Code>www.Something.Com</Code>
      </Vendor>
      <Product>
        <Code>AA</Code>
        <Description Language="en-US">Tray</Description>
        <Price>
            <Value>787</Value>
        </Price>
      </Product>
      <Product>
        <Code>BB</Code>
        <Description Language="en-US">Chair</Description>
        <Price>
            <Value>525</Value>
        </Price>
      </Product>     
    </Enterprise>
  </Envelope>
</Schema>
Here is a condensed version of the code I have written:

import pandas as pd
from lxml import etree as ET

xmlpath ='Catalog.XML'
parser = ET.XMLParser(remove_blank_text=True)
tree = ET.parse(xmlpath, parser)

mtag = ET.Element("Material")

matcount = tree.xpath('count(.//Material)')
entfirst = tree.find('.//Enterprise')  #element of parent element for the Material elements
matfirst = tree.find('.//Material')
matindx = entfirst.index(matfirst)
indx = int(matcount + matindx)    # Index where the new Material element needs to be inserted

# matlist is a list of new materials to add; each one can be looked up in a
# dictionary (matdict) that holds the material description. That code works
# fine and is too long to include here.

for mat in matlist:
        mtag = entfirst.insert(indx, "Material")
        

        ctag = ET.SubElement(mtag, "Code")
        ctag.text = mat

        dtag = ET.SubElement(mtag, "Description", Language = 'en-US')
        
        #matdict is a dictionary that stores the description for the materials
        dtag.text = matdict.get(mat, "")   

        xtag = ET.SubElement(mtag, "ExternalReference")

        urtag = ET.SubElement(xtag, "FileURI")
        urtag.text = mat + ".gm"

        ustag = ET.SubElement(xtag, "Usage")
        ttag = ET.SubElement(ustag, "Type")
        ttag.text = "SwatchImage"
        qtag = ET.SubElement(ustag, "Quality")
        qtag.text = "Medium"

        indx = indx+1

tree.write('output2.xml', pretty_print=True)
I can't get the mtag variable (the first variable in the loop) to hold the new element, and this breaks the whole script because the subsequent subelements have nothing to be appended to. The only way I have gotten the script to work is by creating mtag with the SubElement function, like its child elements; however, that appends everything near the bottom of the XML instead of inserting the new Materials next to the existing ones.
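
One possible way around this, as a minimal sketch rather than a definitive fix: in lxml, insert() expects an Element object (not a tag name string) and returns None, so the element can be created first with ET.Element and the reference kept while it is inserted. The inline XML, the two sample codes, and the small matdict below are stand-ins made up for illustration; the ".jpg" suffix and "Image" type follow the desired output shown earlier rather than the script above. The sketch also assumes the existing Material elements are direct children of Enterprise.

```python
from lxml import etree as ET

# Tiny stand-in for the catalog; the real tree would come from ET.parse(xmlpath, parser).
xml = b"""<Schema>
  <Envelope>
    <Enterprise>
      <Code>www.Something.com</Code>
      <Material><Code>BL</Code></Material>
      <Material><Code>WH</Code></Material>
      <Vendor><Code>www.Something.Com</Code></Vendor>
      <Product><Code>AA</Code></Product>
    </Enterprise>
  </Envelope>
</Schema>"""

parser = ET.XMLParser(remove_blank_text=True)
root = ET.fromstring(xml, parser)

# Hypothetical sample data standing in for matlist / matdict.
matlist = ["RD", "YL"]
matdict = {"RD": "Red", "YL": "Yellow"}

enterprise = root.find(".//Enterprise")
# Look only at direct children of Enterprise and start right after the last Material.
last_material = enterprise.findall("Material")[-1]
indx = enterprise.index(last_material) + 1

for mat in matlist:
    mtag = ET.Element("Material")   # create the element and keep the reference
    enterprise.insert(indx, mtag)   # insert() returns None, so its result is not assigned

    ctag = ET.SubElement(mtag, "Code")
    ctag.text = mat
    dtag = ET.SubElement(mtag, "Description", Language="en-US")
    dtag.text = matdict.get(mat, "")

    xtag = ET.SubElement(mtag, "ExternalReference")
    urtag = ET.SubElement(xtag, "FileURI")
    urtag.text = mat + ".jpg"
    ustag = ET.SubElement(xtag, "Usage")
    ET.SubElement(ustag, "Type").text = "Image"
    ET.SubElement(ustag, "Quality").text = "Medium"

    indx += 1                       # the next new Material goes after this one

# Collected for a quick check of the resulting order.
child_tags = [child.tag for child in enterprise]
material_codes = [m.findtext("Code") for m in enterprise.findall("Material")]
print(ET.tostring(root, pretty_print=True).decode())
```

Taking the index from the last direct Material child, instead of adding count(.//Material) to the index of the first one, avoids counting any Material elements nested elsewhere in the tree. An alternative that needs no index at all would be calling addnext() on the most recently placed Material element.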


Any help here would be appreciated, as I know people are taking time out of their day to help a fledgling Python user. I have researched as best I could, but there is only so much trial and error and research one can go through before knowing when they are out of their element (pun intended!).


My apologies in advance for any confusion caused by my watered-down example; I am more than happy to elaborate further or provide additional examples.


Thank you in advance for your time and response to my conundrum.