Python Forum

Hi all,
I'm new to Python. I am trying to parse an XML file and extract the values between ">" and "<". Here is the XML example:

<?xml version="1.0" encoding="UTF-8"?>
<DisplayDefinitionTable>
	<rows>
		<row>
			<object_tag tag="tagstr" uid="uidstr"/>
			<row_element column="0" component_tag="223011" property_name="property1">VALUE_STR_1</row_element>
			<row_element column="1" component_tag="223011" property_name="property2">VALUE_STR_2</row_element>
			<row_element column="2" component_tag="223011" property_name="property3">VALUE_STR_3</row_element>
			<row_element column="3" component_tag="223011" property_name="property4">VALUE_STR_4</row_element>
			<row_element column="4" component_tag="223011" property_name="property5">VALUE_STR_5</row_element>
			<row_element column="5" component_tag="1182129" property_name="property6">VALUE_STR_6</row_element>
			<row_element column="6" component_tag="81988" property_name="property7">VALUE_STR_7</row_element>
			<row_element column="7" component_tag="223011" property_name="property8">VALUE_STR_8</row_element>
			<row_element column="8" component_tag="223011" property_name="property9">VALUE_STR_9</row_element>
			<row_element column="9" component_tag="223011" property_name="property10">VALUE_STR_10</row_element>
		</row>
		
	</rows>
</DisplayDefinitionTable>

I am trying to extract the value string for property1 between ">" and "<" (VALUE_STR_1).
Here is the code:

from pathlib import Path
import os
import tempfile
import xml.etree.ElementTree as ET

srcpath = Path(__file__).parent.absolute()
os.chdir(srcpath)

tree = ET.parse("example.xml")
root = tree.iter()
#root = tree.getroot()

value= ""
PropertyName =""
for child in root:
     print(child.tag, child.attrib)
     if child.tag == "row_element":
        #print(child.tag,child.attrib)
        PropertyName=child.attrib.get('property_name')
        print('>>',PropertyName)
        value=child.findtext('PropertyName')
        print ("Value from ",PropertyName,":",value)

Here is the corresponding output:

DisplayDefinitionTable {}
rows {}
row {}
object_tag {'tag': 'tagstr', 'uid': 'uidstr'}
row_element {'column': '0', 'component_tag': '223011', 'property_name': 'property1'} 
>> property1
Value from  property1 : None

I tried various approaches without success. I am under the impression that the root element does not contain those values at all. Any help or hint is highly appreciated.

Thx
Matthias

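findtext('PropertyName') searches for a child element with the tag PropertyName, which does not exist here, so it returns None. You can access the text content of the element itself with the text attribute: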

for child in root:
    print(child.tag, child.attrib)
    if child.tag == "row_element":
        # print(child.tag,child.attrib)
        PropertyName = child.attrib.get('property_name')
        print(f'>> {PropertyName}')
        value = child.text
        print(f'Value from {PropertyName}: {value}')
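
If only the value of a single property is needed, ElementTree's limited XPath support can select the element directly by its attribute. Here is a minimal sketch, assuming the XML structure from the question. The inline XML string is shortened to two entries, and the helper name get_property_value is made up for illustration:

```python
import xml.etree.ElementTree as ET

# Shortened copy of the XML from the question (assumption: same structure).
XML = """
<DisplayDefinitionTable>
  <rows>
    <row>
      <object_tag tag="tagstr" uid="uidstr"/>
      <row_element column="0" component_tag="223011" property_name="property1">VALUE_STR_1</row_element>
      <row_element column="1" component_tag="223011" property_name="property2">VALUE_STR_2</row_element>
    </row>
  </rows>
</DisplayDefinitionTable>
"""


def get_property_value(root, property_name):
    """Return the text of the first row_element with this property_name, or None."""
    # ".//" searches all descendants; "[@attr='value']" filters by attribute.
    element = root.find(f".//row_element[@property_name='{property_name}']")
    return element.text if element is not None else None


root = ET.fromstring(XML)
print(get_property_value(root, "property1"))  # VALUE_STR_1
```

With a file on disk, ET.parse("example.xml").getroot() would take the place of ET.fromstring(XML).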

Here is one with BeautifulSoup, as I never use ElementTree (it has caused a lot of unnecessary problems for many people through the years).

from bs4 import BeautifulSoup

# The 'xml' parser requires the lxml package to be installed.
with open('comp.xml') as f:
    soup = BeautifulSoup(f, 'xml')
for row in soup.find_all('row_element'):
    print(row.text)

Output:
VALUE_STR_1
VALUE_STR_2
VALUE_STR_3
VALUE_STR_4
VALUE_STR_5
VALUE_STR_6
VALUE_STR_7
VALUE_STR_8
VALUE_STR_9
VALUE_STR_10
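
The same list can also be produced with the standard library alone, which may help if installing extra packages is not an option. Here is a minimal sketch, assuming the XML structure from the question. The inline XML is shortened to two entries, and the dictionary name values is just for illustration. It collects every value keyed by its property_name:

```python
import xml.etree.ElementTree as ET

# Shortened copy of the XML from the question (assumption: same structure).
XML = """
<DisplayDefinitionTable>
  <rows>
    <row>
      <object_tag tag="tagstr" uid="uidstr"/>
      <row_element column="0" component_tag="223011" property_name="property1">VALUE_STR_1</row_element>
      <row_element column="1" component_tag="223011" property_name="property2">VALUE_STR_2</row_element>
    </row>
  </rows>
</DisplayDefinitionTable>
"""

root = ET.fromstring(XML)

# root.iter("row_element") walks the whole tree and yields only matching tags,
# so no manual tag comparison is needed.
values = {el.get("property_name"): el.text for el in root.iter("row_element")}

for name, value in values.items():
    print(f"{name}: {value}")
```

A dictionary keeps the property names and values together, so a single value can be looked up later with values["property1"].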

matthias100

mlieqo

snippsat