Python Forum
Help with urllib.request
#1
Help on this please; Is a second urlopen needed, in the scenario I describe?

import urllib.request

# Given you open a url
resp = urllib.request.urlopen('http://httpbin.org/xml')

# Then execute a read
stuff = resp.read()

# You can print the result
print(stuff)
# For this post pretend you see the result, it works.

# Then if you read again
stuff2 = resp.read()

# You find that nothing results: the second read returns empty bytes
print(stuff2)
# prints b''

# If you do the open again, then you can read again and
# do get the results
resp = urllib.request.urlopen('http://httpbin.org/xml')

stuff2 = resp.read()

# I see the resp object has a seek function(?). But if that can be used
# to reset a pointer, instead of executing a urlopen again, I have not
# figured it out.
# If there is no real world reason to do a second read this way, then it
# is just an academic question. I am a Python newbie.
buran wrote Apr-19-2021, 05:09 AM:
Please use proper tags when posting code, tracebacks, output, etc. This time I have added the tags for you.
See BBcode help for more info.
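On the seek question in the post above: the object returned by urlopen is a stream over the network connection, so once read() has consumed the body, a second read() returns b''. As far as I know the seek method is only inherited from the io base classes and the response is not actually seekable, so it cannot rewind. A second urlopen is not needed if you keep the bytes from the first read. Below is a minimal sketch of that idea; the helper names fetch_bytes and rewindable are made up for illustration and are not part of urllib.

```python
import io
import urllib.request


def fetch_bytes(url):
    """Open the URL once and return the whole response body as bytes."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def rewindable(data):
    """Wrap already-downloaded bytes in an in-memory file object.

    Unlike the HTTP response, io.BytesIO supports seek(0), so the same
    data can be read again without touching the network.
    """
    return io.BytesIO(data)
```

Usage would be something like data = fetch_bytes('http://httpbin.org/xml'), after which data can be printed or parsed as many times as needed. If a file-like object is required, buf = rewindable(data) gives one where buf.seek(0) resets the position before the next buf.read().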
#2
(Apr-19-2021, 01:53 AM)Brian177 Wrote: Help on this please; Is a second urlopen needed, in the scenario I describe?
My advice is not to use urllib.
Requests took over these tasks many years ago and does them in a better way.
Example:
>>> import requests
>>> 
>>> resp = requests.get('http://httpbin.org/xml')
>>> resp.status_code
200
>>> stuff = resp.text
>>> print(stuff)
Output:
<?xml version='1.0' encoding='us-ascii'?> <!-- A SAMPLE set of slides --> <slideshow title="Sample Slide Show" date="Date of publication" author="Yours Truly" > <!-- TITLE SLIDE --> <slide type="all"> <title>Wake up to WonderWidgets!</title> </slide> <!-- OVERVIEW --> <slide type="all"> <title>Overview</title> <item>Why <em>WonderWidgets</em> are great</item> <item/> <item>Who <em>buys</em> WonderWidgets</item> </slide> </slideshow>
Since the output is XML, it is also common to add BeautifulSoup (bs4) so you can parse the result.
import requests
from bs4 import BeautifulSoup

url = 'http://httpbin.org/xml'
response = requests.get(url)
soup = BeautifulSoup(response.content, 'lxml')
title = soup.select_one('title')
print(title.text)
Output:
Wake up to WonderWidgets!
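If installing third-party packages is not an option, the same title can probably be pulled out with the standard library's xml.etree.ElementTree. This is only a sketch: SAMPLE is a shortened copy of the httpbin output shown above so that it runs offline, and the helper names first_slide_title and all_slide_titles are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the XML shown in the output above (assumption: the
# real response has the same slideshow/slide/title layout).
SAMPLE = b"""<?xml version='1.0' encoding='us-ascii'?>
<slideshow title="Sample Slide Show">
  <slide type="all">
    <title>Wake up to WonderWidgets!</title>
  </slide>
  <slide type="all">
    <title>Overview</title>
  </slide>
</slideshow>"""


def first_slide_title(xml_bytes):
    """Return the text of the first <slide><title>, or None if absent."""
    root = ET.fromstring(xml_bytes)  # root is the <slideshow> element
    title = root.find("slide/title")
    return title.text if title is not None else None


def all_slide_titles(xml_bytes):
    """Return the title text of every slide, in document order."""
    root = ET.fromstring(xml_bytes)
    return [t.text for t in root.iterfind("slide/title")]
```

With a live response, passing response.content from requests (or the bytes returned by read() from urllib) to first_slide_title should work the same way.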
#3
Thank you very much!