Thread: xml.etree.ElementTree extract string values (Python Forum, https://python-forum.io, General Coding Help, /thread-28279.html)
xml.etree.ElementTree extract string values - matthias100 - Jul-12-2020

Hi all,

I'm new to Python. I am trying to parse an XML file and extract the values between ">" and "<". Here is the XML example:

    <?xml version="1.0" encoding="UTF-8"?>
    <DisplayDefinitionTable>
      <rows>
        <row>
          <object_tag tag="tagstr" uid="uidstr"/>
          <row_element column="0" component_tag="223011" property_name="property1">VALUE_STR_1</row_element>
          <row_element column="1" component_tag="223011" property_name="property2">VALUE_STR_2</row_element>
          <row_element column="2" component_tag="223011" property_name="property3">VALUE_STR_3</row_element>
          <row_element column="3" component_tag="223011" property_name="property4">VALUE_STR_4</row_element>
          <row_element column="4" component_tag="223011" property_name="property5">VALUE_STR_5</row_element>
          <row_element column="5" component_tag="1182129" property_name="property6">VALUE_STR_6</row_element>
          <row_element column="6" component_tag="81988" property_name="property7">VALUE_STR_7</row_element>
          <row_element column="7" component_tag="223011" property_name="property8">VALUE_STR_8</row_element>
          <row_element column="8" component_tag="223011" property_name="property9">VALUE_STR_9</row_element>
          <row_element column="9" component_tag="223011" property_name="property10">VALUE_STR_10</row_element>
        </row>
      </rows>
    </DisplayDefinitionTable>

I'm trying to extract the value string for property1, i.e. the text between ">" and "<" (VALUE_STR_1).
Here is the code:

    from pathlib import Path
    import os
    import tempfile
    import xml.etree.ElementTree as ET

    srcpath = Path(__file__).parent.absolute()
    os.chdir(srcpath)
    tree = ET.parse("example.xml")
    root = tree.iter()
    #root = tree.getroot()
    value = ""
    PropertyName = ""
    for child in root:
        print(child.tag, child.attrib)
        if child.tag == "row_element":
            #print(child.tag,child.attrib)
            PropertyName = child.attrib.get('property_name')
            print('>>', PropertyName)
            value = child.findtext('PropertyName')
            print("Value from ", PropertyName, ":", value)

Here is the corresponding output:

    DisplayDefinitionTable {}
    rows {}
    row {}
    object_tag {'tag': 'tagstr', 'uid': 'uidstr'}
    row_element {'column': '0', 'component_tag': '223011', 'property_name': 'property1'}
    >> property1
    Value from property1 : None

I tried various approaches, but without success. I am under the impression that the root element does not have those values at all. Any help or hint is highly appreciated.

Thanks,
Matthias

RE: xml.etree.ElementTree extract string values - mlieqo - Jul-12-2020

You can access the text content simply with the text attribute:

    for child in root:
        print(child.tag, child.attrib)
        if child.tag == "row_element":
            # print(child.tag,child.attrib)
            PropertyName = child.attrib.get('property_name')
            print(f'>> {PropertyName}')
            # .text holds the string between the opening and closing tag
            value = child.text
            print(f'Value from {PropertyName}: {value}')

RE: xml.etree.ElementTree extract string values - snippsat - Jul-12-2020

Here is one with BeautifulSoup, as I never use ElementTree (it has caused a lot of unnecessary problems for many people through the years).

    from bs4 import BeautifulSoup

    # The 'xml' parser requires lxml to be installed
    with open('comp.xml') as f:
        soup = BeautifulSoup(f, 'xml')

    for row in soup.find_all('row_element'):
        print(row.text)
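
For reference, here is a minimal self-contained sketch of the ElementTree approach from mlieqo's reply. The original call `child.findtext('PropertyName')` returns None because `findtext` looks for a child element whose tag is literally "PropertyName", and a `row_element` has no child elements; the string between the tags is stored in the element's `text` attribute. The names `XML_TEXT`, `extract_values` and `value_for` are hypothetical and only used for illustration, and the XML is a shortened copy of the example from the question (without the XML declaration) so that the snippet runs without a file.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the XML from the question (assumption: three rows are
# enough to illustrate the idea; the XML declaration is left out).
XML_TEXT = """<DisplayDefinitionTable>
  <rows>
    <row>
      <object_tag tag="tagstr" uid="uidstr"/>
      <row_element column="0" component_tag="223011" property_name="property1">VALUE_STR_1</row_element>
      <row_element column="1" component_tag="223011" property_name="property2">VALUE_STR_2</row_element>
      <row_element column="5" component_tag="1182129" property_name="property6">VALUE_STR_6</row_element>
    </row>
  </rows>
</DisplayDefinitionTable>
"""


def extract_values(xml_text):
    """Return {property_name: text} for every row_element in the document."""
    root = ET.fromstring(xml_text)
    values = {}
    # iter("row_element") walks the whole tree but yields only matching tags
    for element in root.iter("row_element"):
        values[element.get("property_name")] = element.text
    return values


def value_for(xml_text, property_name):
    """Return the text of the row_element with the given property_name, or None."""
    root = ET.fromstring(xml_text)
    # ElementTree's limited XPath supports [@attribute='value'] predicates
    element = root.find(f".//row_element[@property_name='{property_name}']")
    return None if element is None else element.text


values = extract_values(XML_TEXT)
for name, value in values.items():
    print(f"Value from {name}: {value}")
```

With a file on disk, `ET.parse("example.xml").getroot()` could take the place of `ET.fromstring(XML_TEXT)`; the rest stays the same. Collecting the values into a dict is just one possible design choice, handy when several properties are needed later.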