 XML Parsing - Find a specific text (ElementTree)
Hi all,

I'm having trouble extracting a specific value from an XML file.

What I need is the value (BB001234) of the IDTAG, but I don't know how to grab it.

Here is my full .xml file and my Python code.
The problem is that the number of "<DataPoint> ### </DataPoint>" elements can vary.

Hopefully someone can help me.

Thank you

import os
from xml.etree import ElementTree

file_name = 'cumulus.xml'
full_file = os.path.abspath(os.path.join('data', file_name))
dom = ElementTree.parse(full_file)

# Paths are relative to the root element of the document
assy = dom.findall('WorkOrders/CumulusWorkOrder/Assembly')

for c in assy:
    item = c.find('PartNumber').text
    serial = c.find('SerialLotNumber').text
    desc = c.find('Description').text
    # idtag = c.find('IDTAG').text

    #print(' * {} - {} - {} - {}'.format(
    #    item, serial, desc, idtag))
    print(' * {} - {} - {} - '.format(
        item, serial, desc))
Output:
 * 1234567 - 1234567.abcdef - Item Description -
You can use lxml:
from lxml import etree

# Parse the file and select every Value element under the DataPoints
tree = etree.parse('cumulus.xml')
# print(etree.tostring(tree))
elementPath = '/CumulusWorkOrderGroup/WorkOrders/CumulusWorkOrder/Assembly/DataPoints/DataPoint/Value'
element = tree.xpath(elementPath)
The value you want is at index 22 of the resulting list (element[22]). If you play with it a bit, you can build a more specific path.
Note that you can iterate through 'element' if you don't know what the index is:
for n, item in enumerate(element):
    print(f'{n}: {etree.tostring(item)}')
0: b'<Value xmlns:xsi="" xmlns:xsd=""/>\n'
1: b'<Value xmlns:xsi="" xmlns:xsd=""/>\n'
2: b'<Value xmlns:xsi="" xmlns:xsd=""/>\n'
3: b'<Value xmlns:xsi="" xmlns:xsd=""/>\n'
4: b'<Value xmlns:xsi="" xmlns:xsd="">No</Value>\n'
5: b'<Value xmlns:xsi="" xmlns:xsd=""/>\n'
6: b'<Value xmlns:xsi="" xmlns:xsd=""/>\n'
7: b'<Value xmlns:xsi="" xmlns:xsd=""/>\n'
8: b'<Value xmlns:xsi="" xmlns:xsd=""/>\n'
9: b'<Value xmlns:xsi="" xmlns:xsd="">Optiklot</Value>\n'
10: b'<Value xmlns:xsi="" xmlns:xsd="">No</Value>\n'
11: b'<Value xmlns:xsi="" xmlns:xsd=""/>\n'
12: b'<Value xmlns:xsi="" xmlns:xsd=""/>\n'
13: b'<Value xmlns:xsi="" xmlns:xsd=""/>\n'
14: b'<Value xmlns:xsi="" xmlns:xsd=""/>\n'
15: b'<Value xmlns:xsi="" xmlns:xsd="">No</Value>\n'
16: b'<Value xmlns:xsi="" xmlns:xsd="">No</Value>\n'
17: b'<Value xmlns:xsi="" xmlns:xsd="">No</Value>\n'
18: b'<Value xmlns:xsi="" xmlns:xsd=""/>\n'
19: b'<Value xmlns:xsi="" xmlns:xsd=""/>\n'
20: b'<Value xmlns:xsi="" xmlns:xsd=""/>\n'
21: b'<Value xmlns:xsi="" xmlns:xsd=""/>\n'
22: b'<Value xmlns:xsi="" xmlns:xsd="">BB001234</Value>\n'
23: b'<Value xmlns:xsi="" xmlns:xsd="">True</Value>\n'
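Since the number of DataPoint elements can change, a fixed index like 22 may stop pointing at the right Value. A minimal sketch of looking the value up by name instead, using only the standard library: it assumes each DataPoint holds a Measurement child (with the text IDTAG) next to its Value. The sample XML below is hypothetical, including the LOT and PASSED measurement names and the find_value helper, because the real cumulus.xml is not shown in the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the real cumulus.xml is not shown in the thread.
# Assumption: every DataPoint has a Measurement child and a Value child.
SAMPLE = """\
<CumulusWorkOrderGroup>
  <WorkOrders>
    <CumulusWorkOrder>
      <Assembly>
        <PartNumber>1234567</PartNumber>
        <DataPoints>
          <DataPoint><Measurement>LOT</Measurement><Value>Optiklot</Value></DataPoint>
          <DataPoint><Measurement>IDTAG</Measurement><Value>BB001234</Value></DataPoint>
          <DataPoint><Measurement>PASSED</Measurement><Value>True</Value></DataPoint>
        </DataPoints>
      </Assembly>
    </CumulusWorkOrder>
  </WorkOrders>
</CumulusWorkOrderGroup>
"""


def find_value(root, measurement):
    """Return the Value text of the DataPoint whose Measurement matches, or None."""
    # The [Measurement='...'] predicate keeps only DataPoints that have a
    # Measurement child with exactly that text, wherever they sit in the list.
    node = root.find(f".//DataPoint[Measurement='{measurement}']/Value")
    return None if node is None else node.text


root = ET.fromstring(SAMPLE)
idtag = find_value(root, "IDTAG")
print(idtag)
```

The same predicate should also work in the lxml xpath() call above, since it is ordinary XPath. If the real file names the child element differently, only the predicate needs to change.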
The parsers in the standard library are not the best; you're better off using lxml as Larz60+ shows, or BeautifulSoup with lxml as the chosen parser.
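from bs4 import BeautifulSoup

# Open the file in a with-block so it is closed after parsing
with open("cumulus.xml") as f:
    soup = BeautifulSoup(f, 'lxml')
# Tag names are lowercased by this parser, hence "measurement"
id_tag = soup.find("measurement", string="IDTAG")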
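The find() call returns the tag that holds the text IDTAG, not the value itself. A rough sketch of the remaining step, assuming the Value element comes right after the Measurement element inside the same DataPoint. The fragment below is hypothetical, and it uses the built-in "html.parser" only so that the sketch needs nothing beyond bs4; it lowercases tag names as well, so the lookup is the same.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment: the real cumulus.xml is not shown in the thread.
# Assumption: Value follows Measurement inside each DataPoint.
SAMPLE = """
<DataPoints>
  <DataPoint><Measurement>LOT</Measurement><Value>Optiklot</Value></DataPoint>
  <DataPoint><Measurement>IDTAG</Measurement><Value>BB001234</Value></DataPoint>
</DataPoints>
"""

soup = BeautifulSoup(SAMPLE, "html.parser")

# Locate the tag whose text is IDTAG (tag names are lowercased by the parser)
id_tag = soup.find("measurement", string="IDTAG")

# Step to the next <value> tag in document order and read its text
value_tag = id_tag.find_next("value") if id_tag is not None else None
id_value = value_tag.get_text(strip=True) if value_tag is not None else None
print(id_value)
```

The None checks keep the script from raising an AttributeError when a file has no IDTAG entry at all.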
Thanks to both of you!
This is really helpful.

Best Regards
