Python Forum

Full Version: Help with urllib.request
Help on this please; Is a second urlopen needed, in the scenario I describe?

import urllib.request

# Given you open a url
resp = urllib.request.urlopen('http://httpbin.org/xml')

# Then execute a read
stuff = resp.read()

# You can print the result
print(stuff)
# For this post pretend you see the result, it works.

# Then if you read again
stuff2 = resp.read()

# You find that nothing results: the body was already consumed by the
# first read, so an empty bytes object comes back
print(stuff2)
b''

# If you do the open again, then you can read again and
# do get the results
resp = urllib.request.urlopen('http://httpbin.org/xml')

stuff2 = resp.read()

# I see the resp object has a seek function(?). But if that can be used
# to reset a pointer, instead of executing a urlopen again, I have not
# figured it out.
# If there is no real world reason to do a second read this way, then it
# is just an academic question. I am a Python newbie.
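On the seek question: as far as I know, an HTTP response from urlopen streams the body from the socket and is not seekable, so there is no pointer to reset. The usual approach is to read once and keep the bytes in a variable. If a file-like object that can be rewound is really needed, the saved bytes can be wrapped in io.BytesIO. A minimal sketch, where the sample bytes are a made-up stand-in for what resp.read() returned, so that it runs without a network connection:

```python
import io

# Hypothetical stand-in for the bytes returned by resp.read().
body = b"<slideshow><title>Wake up to WonderWidgets!</title></slideshow>"

# Wrap the saved bytes in an in-memory, seekable file-like object.
buf = io.BytesIO(body)

first = buf.read()    # reads everything; the position is now at the end
second = buf.read()   # nothing left to read, so this is b''
buf.seek(0)           # rewind to the start of the buffer
third = buf.read()    # the full body again

print(first == body, second, third == body)
```

In most scripts the BytesIO step is unnecessary: stuff already holds the whole body after the first read, so it can simply be reused instead of reading or opening again.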
(Apr-19-2021, 01:53 AM)Brian177 Wrote: Help on this please; Is a second urlopen needed, in the scenario I describe?
My advice is not to use urllib.
Requests took over these tasks many years ago and handles them in a better way.
Example:
>>> import requests
>>> 
>>> resp = requests.get('http://httpbin.org/xml')
>>> resp.status_code
200
>>> stuff = resp.text
>>> print(stuff)
Output:
<?xml version='1.0' encoding='us-ascii'?>
<!-- A SAMPLE set of slides -->
<slideshow
    title="Sample Slide Show"
    date="Date of publication"
    author="Yours Truly"
    >
    <!-- TITLE SLIDE -->
    <slide type="all">
        <title>Wake up to WonderWidgets!</title>
    </slide>
    <!-- OVERVIEW -->
    <slide type="all">
        <title>Overview</title>
        <item>Why <em>WonderWidgets</em> are great</item>
        <item/>
        <item>Who <em>buys</em> WonderWidgets</item>
    </slide>
</slideshow>
Since the output is XML, it is also common to add BeautifulSoup (bs4) so you can parse the result.
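import requests
from bs4 import BeautifulSoup

url = 'http://httpbin.org/xml'
response = requests.get(url)
# 'lxml' selects the lxml parser, so the lxml package must be installed
soup = BeautifulSoup(response.content, 'lxml')
# select_one returns the first element matching the CSS selector
title = soup.select_one('title')
print(title.text)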
Output:
Wake up to WonderWidgets!
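If installing extra packages is not an option, the same title lookup can probably be done with the standard library's xml.etree.ElementTree. A minimal sketch, assuming the XML is already in a string; the sample below is an abbreviated copy of the output shown earlier, not a live download:

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the httpbin XML shown above (assumption: the real
# text would come from the HTTP response rather than a literal).
sample = """<slideshow title="Sample Slide Show">
    <slide type="all">
        <title>Wake up to WonderWidgets!</title>
    </slide>
    <slide type="all">
        <title>Overview</title>
    </slide>
</slideshow>"""

root = ET.fromstring(sample)

# find() returns the first matching element in document order
first_title = root.find('.//title')
print(first_title.text)

# iter() walks every matching element in the tree
all_titles = [t.text for t in root.iter('title')]
print(all_titles)
```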
Thank you very much!