Python Forum
xml.etree.ElementTree extract string values
#1
Hi all,
I am new to Python. I am trying to parse an XML file and extract the text values between ">" and "<". Here is the XML example:

<?xml version="1.0" encoding="UTF-8"?>
<DisplayDefinitionTable>
	<rows>
		<row>
			<object_tag tag="tagstr" uid="uidstr"/>
			<row_element column="0" component_tag="223011" property_name="property1">VALUE_STR_1</row_element>
			<row_element column="1" component_tag="223011" property_name="property2">VALUE_STR_2</row_element>
			<row_element column="2" component_tag="223011" property_name="property3">VALUE_STR_3</row_element>
			<row_element column="3" component_tag="223011" property_name="property4">VALUE_STR_4</row_element>
			<row_element column="4" component_tag="223011" property_name="property5">VALUE_STR_5</row_element>
			<row_element column="5" component_tag="1182129" property_name="property6">VALUE_STR_6</row_element>
			<row_element column="6" component_tag="81988" property_name="property7">VALUE_STR_7</row_element>
			<row_element column="7" component_tag="223011" property_name="property8">VALUE_STR_8</row_element>
			<row_element column="8" component_tag="223011" property_name="property9">VALUE_STR_9</row_element>
			<row_element column="9" component_tag="223011" property_name="property10">VALUE_STR_10</row_element>
		</row>
		
	</rows>
</DisplayDefinitionTable>

I am trying to extract the value string for property1, i.e. the text between ">" and "<" (VALUE_STR_1).
Here is the code:

from pathlib import Path
import os
import tempfile
import xml.etree.ElementTree as ET

srcpath = Path(__file__).parent.absolute()
os.chdir(srcpath)

tree = ET.parse("example.xml")
root = tree.iter()
#root = tree.getroot()

value= ""
PropertyName =""
for child in root:
     print(child.tag, child.attrib)
     if child.tag == "row_element":
        #print(child.tag,child.attrib)
        PropertyName=child.attrib.get('property_name')
        print('>>',PropertyName)
        value=child.findtext('PropertyName')
        print ("Value from ",PropertyName,":",value)
Here is the corresponding output:

DisplayDefinitionTable {}
rows {}
row {}
object_tag {'tag': 'tagstr', 'uid': 'uidstr'}
row_element {'column': '0', 'component_tag': '223011', 'property_name': 'property1'} 
>> property1
Value from  property1 : None
I tried various approaches, but without success. I am under the impression that the elements I get from the tree do not contain those values at all. Any help or hint is highly appreciated.

Thx
Matthias
#2
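findtext('PropertyName') looks for a child element whose tag is literally "PropertyName", and row_element has no such child, so it returns None. The text between the tags is available directly through the element's text attribute: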
for child in root:
    print(child.tag, child.attrib)
    if child.tag == "row_element":
        # print(child.tag,child.attrib)
        PropertyName = child.attrib.get('property_name')
        print(f'>> {PropertyName}')
        value = child.text
        print(f'Value from {PropertyName}: {value}')
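If you want the values in a structure you can look things up in, rather than printed one by one, something like the following might work. It is only a sketch: SAMPLE_XML is a shortened stand-in for the file from the question, and collect_values is a made-up helper name, not part of ElementTree.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for example.xml from the question (assumption: same structure).
SAMPLE_XML = """<DisplayDefinitionTable>
    <rows>
        <row>
            <object_tag tag="tagstr" uid="uidstr"/>
            <row_element column="0" component_tag="223011" property_name="property1">VALUE_STR_1</row_element>
            <row_element column="1" component_tag="223011" property_name="property2">VALUE_STR_2</row_element>
            <row_element column="2" component_tag="223011" property_name="property3">VALUE_STR_3</row_element>
        </row>
    </rows>
</DisplayDefinitionTable>"""


def collect_values(xml_text):
    """Map each row_element's property_name attribute to its text content."""
    root = ET.fromstring(xml_text)
    # iter("row_element") walks the whole tree but yields only matching tags,
    # so no manual tag comparison is needed.
    return {el.get("property_name"): el.text for el in root.iter("row_element")}


values = collect_values(SAMPLE_XML)
print(values["property1"])
```

For a file on disk, ET.parse("example.xml").getroot() would take the place of ET.fromstring. Note that if two row_element entries shared the same property_name, the later one would overwrite the earlier one in the dict.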
#3
Here is one with BeautifulSoup, as I never use ElementTree (it has caused a lot of unnecessary problems for many people through the years).
from bs4 import BeautifulSoup

# The 'xml' parser requires lxml to be installed.
with open('comp.xml') as f:
    soup = BeautifulSoup(f, 'xml')
for row in soup.find_all('row_element'):
    print(row.text)
Output:
VALUE_STR_1
VALUE_STR_2
VALUE_STR_3
VALUE_STR_4
VALUE_STR_5
VALUE_STR_6
VALUE_STR_7
VALUE_STR_8
VALUE_STR_9
VALUE_STR_10
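For comparison, if only the value of one property is needed (property1 in the original question), the standard library can probably do it too, using the limited XPath support in ElementTree's find. This is a sketch under assumptions: SAMPLE_XML is a shortened stand-in for the file, and value_for is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the XML from the question (assumption: same structure).
SAMPLE_XML = """<DisplayDefinitionTable>
    <rows>
        <row>
            <row_element column="0" component_tag="223011" property_name="property1">VALUE_STR_1</row_element>
            <row_element column="1" component_tag="223011" property_name="property2">VALUE_STR_2</row_element>
        </row>
    </rows>
</DisplayDefinitionTable>"""


def value_for(root, property_name):
    """Return the text of the first row_element with the given property_name, or None."""
    # ".//" searches all descendants; "[@attr='value']" filters on an attribute.
    # property_name is assumed not to contain a single quote.
    el = root.find(f".//row_element[@property_name='{property_name}']")
    return None if el is None else el.text


root = ET.fromstring(SAMPLE_XML)
print(value_for(root, "property1"))
```

The None check matters because find returns None when nothing matches, and calling .text on that would raise an AttributeError.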