Python Forum
XML using xml.etree.ElementTree Question
#1
Please advise me on the following issue. There are three sections below: the data, the Python code, and the output. On the start event, elem.text does not always hold the tag's value; sometimes it is None. So the start event does not seem to be a reliable way to read a tag's value, while the end event provides the value every time. Is this a bug in Python? Is there a better way to get the tag value from the start event?

File name: XMLD.DOCUMT.xml

Line 148:
<didFundLendSecurities>N</didFundLendSecurities>

Line 322:
<didFundLendSecurities>N</didFundLendSecurities>

Line 495:
<didFundLendSecurities>N</didFundLendSecurities>

--

# xml.etree.cElementTree was removed in Python 3.9; ElementTree uses the C accelerator automatically
import xml.etree.ElementTree as ET

v_file_name = "XMLD.DOCUMT.xml"

# iterparse yields (event, element) pairs while the file is still being read
for event, elem in ET.iterparse(v_file_name, events=("start", "end")):
    # Strip the "{namespace}" prefix from the tag, if there is one
    v_elem_tag = elem.tag.rsplit("}", 1)[-1]
    if not elem.attrib:
        # Tags without attributes go here
        if event == "start" and v_elem_tag == "didFundLendSecurities":
            print("Start: ", v_elem_tag, elem.text)
        if event == "end" and v_elem_tag == "didFundLendSecurities":
            print("End: ", v_elem_tag, elem.text)
    else:
        # Tags with attributes go here
        pass

--

Start: didFundLendSecurities N
End: didFundLendSecurities N

Start: didFundLendSecurities None
End: didFundLendSecurities N

Start: didFundLendSecurities N
End: didFundLendSecurities N
#2
It is not a bug. The documentation for iterparse() says:

Quote: Note: iterparse() only guarantees that it has seen the “>” character of a starting tag when it emits a “start” event, so the attributes are defined, but the contents of the text and tail attributes are undefined at that point. The same applies to the element children; they may or may not be present. If you need a fully populated element, look for “end” events instead.

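The reason is that the parser reads the file incrementally, so the element's text may not have been read yet when the start event is emitted. Whether it has been read depends on where the parser's current chunk of input happens to end, which is why you get N for some elements and None for others.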
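
As a rough sketch of what the documentation suggests, you could listen only for "end" events and read elem.text there. The sample data, the namespace URI, and the helper names local_name and collect_text below are made up for illustration; they are not from your file.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for XMLD.DOCUMT.xml; the namespace URI is invented.
SAMPLE = b"""<?xml version="1.0"?>
<root xmlns="http://example.com/ns">
  <fund><didFundLendSecurities>N</didFundLendSecurities></fund>
  <fund><didFundLendSecurities>Y</didFundLendSecurities></fund>
  <fund><didFundLendSecurities>N</didFundLendSecurities></fund>
</root>"""


def local_name(tag):
    """Strip a '{namespace}' prefix, if any, from an element tag."""
    return tag.rsplit("}", 1)[-1]


def collect_text(source, wanted):
    """Return the text of every element named `wanted`, read on 'end' events."""
    values = []
    for event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) == wanted:
            # At the 'end' event the element is fully parsed, so text is populated.
            values.append(elem.text)
        # Free the element's contents once it has been handled (helps with large files).
        elem.clear()
    return values


print(collect_text(io.BytesIO(SAMPLE), "didFundLendSecurities"))
# prints ['N', 'Y', 'N']
```

If you also need something at the start of an element, the attributes are safe to read on the "start" event; only text, tail, and children have to wait for "end".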