XML parsing from URL

***snippsat*** · Nov-21-2018, 04:34 PM

My take on this is that you should drop urllib and ElementTree altogether.
So what should you use instead?
For reading URLs and all other HTTP work, use Requests.
For parsing, use lxml and BeautifulSoup.
There is a tutorial here.

An example that solves the first task.
lxml:

from lxml import html
import requests

url = 'http://py4e-data.dr-chuck.net/comments_42.xml'
response = requests.get(url)
tree = html.fromstring(response.content)
count = tree.xpath('//count')
total = sum(int(i.text) for i in count)
print(f'The sum of all count in doc are: {total}')

Output:
The sum of all count in doc are: 2553

BeautifulSoup:

from bs4 import BeautifulSoup
import requests

url = 'http://py4e-data.dr-chuck.net/comments_42.xml'
response = requests.get(url)
soup = BeautifulSoup(response.content, 'lxml')
count = soup.find_all('count')
total = sum(int(i.text) for i in count)
print(f'The sum of all count in doc are: {total}')

Output:
The sum of all count in doc are: 2553
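
Both examples above fetch and parse in one go. As a rough sketch, the parsing step can also be split into its own function so it can be tried without network access. The function name `sum_counts` and the `SAMPLE_XML` data are made up for illustration and are not the real contents of `comments_42.xml`. The fallback import is there only so the sketch still runs where lxml is not installed; it relies on `lxml.etree` following the ElementTree API for the calls used here.

```python
try:
    from lxml import etree  # the parser recommended in this post
except ImportError:
    # Fallback so the sketch runs without lxml installed; fromstring() and
    # iter() exist with the same behavior in both libraries.
    import xml.etree.ElementTree as etree

# Made-up sample data for illustration, not the real file contents.
SAMPLE_XML = b"""<commentinfo>
  <comments>
    <comment><name>Ann</name><count>10</count></comment>
    <comment><name>Bob</name><count>20</count></comment>
    <comment><name>Cid</name><count>12</count></comment>
  </comments>
</commentinfo>"""


def sum_counts(xml_bytes):
    """Return the sum of the integer text of every <count> element."""
    root = etree.fromstring(xml_bytes)
    # iter('count') walks the whole tree and yields each <count> element.
    return sum(int(node.text) for node in root.iter('count'))


total = sum_counts(SAMPLE_XML)
print(f'The sum of all count in doc are: {total}')
```

With Requests, the same function could be fed `response.content` after calling `response.raise_for_status()`, so that an HTTP error page is not parsed by mistake.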
