Reading and Extracting XML

Python4Ever · Dec-12-2017, 03:58 PM

Can someone please tell me why this code isn't printing integers for the list of count items? The response is encoded as UTF-8, but I thought fromstring() parses the XML directly into elements, which findall() then collects in lst so I can pull the values out. If I need to decode lst, how would I do that?

import urllib.request, urllib.parse, urllib.error
import xml.etree.ElementTree as ET

xml = 'http://py4e-data.dr-chuck.net/comments_42.xml'

url = urllib.request.urlopen(xml)
data = url.read()
print(data.decode())
tree = ET.fromstring(data)
lst = tree.findall('.//count')
print(lst)

Here is the output of print(lst):
[<Element 'count' at 0x000000000BD1EBD8>, <Element 'count' at 0x000000000BD1EEA8>, <Element 'count' at 0x000000000BD1EF98>, <Element 'count' at 0x000000000BD28228>, <Element 'count' at 0x000000000BD28318>, <Element 'count' at 0x000000000BD28408>, <Element 'count' at 0x000000000BD284F8>, <Element 'count' at 0x000000000BD285E8>, <Element 'count' at 0x000000000BD286D8>, <Element 'count' at 0x000000000BD287C8>, <Element 'count' at 0x000000000BD288B8>, <Element 'count' at 0x000000000BD289A8>, <Element 'count' at 0x000000000BD28A98>, <Element 'count' at 0x000000000BD28B88>, <Element 'count' at 0x000000000BD28C78>, <Element 'count' at 0x000000000BD28D68>, <Element 'count' at 0x000000000BD28E58>, <Element 'count' at 0x000000000BD28F48>, <Element 'count' at 0x000000000BD2B098>, <Element 'count' at 0x000000000BD2B188>, <Element 'count' at 0x000000000BD2B278>, <Element 'count' at 0x000000000BD2B368>, <Element 'count' at 0x000000000BD2B458>, <Element 'count' at 0x000000000BD2B548>, <Element 'count' at 0x000000000BD2B638>, <Element 'count' at 0x000000000BD2B728>, <Element 'count' at 0x000000000BD2B818>, <Element 'count' at 0x000000000BD2B908>, <Element 'count' at 0x000000000BD2B9F8>, <Element 'count' at 0x000000000BD2BAE8>, <Element 'count' at 0x000000000BD2BBD8>, <Element 'count' at 0x000000000BD2BCC8>, <Element 'count' at 0x000000000BD2BDB8>, <Element 'count' at 0x000000000BD2BEA8>, <Element 'count' at 0x000000000BD2BF98>, <Element 'count' at 0x000000000BD2E0E8>, <Element 'count' at 0x000000000BD2E1D8>, <Element 'count' at 0x000000000BD2E2C8>, <Element 'count' at 0x000000000BD2E3B8>, <Element 'count' at 0x000000000BD2E4A8>, <Element 'count' at 0x000000000BD2E598>, <Element 'count' at 0x000000000BD2E688>, <Element 'count' at 0x000000000BD2E778>, <Element 'count' at 0x000000000BD2E868>, <Element 'count' at 0x000000000BD2E958>, <Element 'count' at 0x000000000BD2EA48>, <Element 'count' at 0x000000000BD2EB38>, <Element 'count' at 0x000000000BD2EC28>, <Element 'count' at 
0x000000000BD2ED18>, <Element 'count' at 0x000000000BD2EE08>]

***snippsat*** · Dec-12-2017, 04:19 PM

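Please use code tags when posting code; see the BBCode help.
Nothing needs decoding: findall() returns a list of Element objects, so you have to extract the text from each object in the list.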

import urllib.request, urllib.parse, urllib.error
import xml.etree.ElementTree as ET

xml = 'http://py4e-data.dr-chuck.net/comments_42.xml'
url = urllib.request.urlopen(xml)
data = url.read()
#print(data.decode())
tree = ET.fromstring(data)
lst = tree.findall('.//count')
for item in lst:
    print(item.text)
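
Note that item.text is still a string, so if you want actual integers you would convert each one with int(). Here is a minimal sketch of that step. The inline sample XML and the helper name extract_counts are made up for illustration, so it runs without a network connection; the layout of the real file is assumed to be similar.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for the downloaded bytes (url.read()).
# The element names other than 'count' are assumptions for illustration.
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<commentinfo>
  <comments>
    <comment><name>Alice</name><count>97</count></comment>
    <comment><name>Bob</name><count>3</count></comment>
  </comments>
</commentinfo>"""

def extract_counts(xml_bytes):
    """Parse XML bytes and return every <count> value as an int."""
    tree = ET.fromstring(xml_bytes)  # accepts bytes; no manual decode needed
    # .text is a str such as '97', so convert each one explicitly
    return [int(item.text) for item in tree.findall('.//count')]

counts = extract_counts(sample)
print(counts)       # list of ints rather than Element objects
print(sum(counts))  # total of all counts
```

With the real data you would pass the result of url.read() to the function instead of sample.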

Python4Ever · Dec-12-2017, 04:32 PM

Of course. Thank you!!
