Python Forum
URLLIB.REQUEST Not Working
#1
import urllib.error
import urllib.request

url = 'https://www.whoscored.com/Players/5583/Show/Cristiano-Ronaldo'

# Send a browser-like User-Agent (Chrome on Linux) instead of the default
# "Python-urllib/x.y" string, which some sites reject.
headers = {
    'User-Agent': "Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.27 Safari/537.17",
}

try:
    req = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(req) as resp:
        respData = resp.read()  # raw bytes
    print(respData)

    # Decode before saving; str(respData) would write the b'...' repr instead.
    with open('withHeaders.txt', 'w', encoding='utf-8') as saveFile:
        saveFile.write(respData.decode('utf-8', errors='replace'))
except urllib.error.URLError as e:
    # Covers HTTPError as well as connection failures.
    print(e)
The code above does not return the page source. Instead, the response body is the following block page:

Error:
b'<html style="height:100%"><head><META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"><meta name="format-detection" content="telephone=no"><meta name="viewport" content="initial-scale=1.0"><meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1"></head><body style="margin:0px;height:100%"><iframe src="/_Incapsula_Resource?CWUDNSAI=9&xinfo=1-67208763-0%200NNN%20RT%281505746830921%2010%29%20q%280%20-1%20-1%20-1%29%20r%280%20-1%29%20B12%284%2c316%2c0%29%20U2&incident_id=500000570207585028-516670650047529873&edet=12&cinfo=04000000" frameborder=0 width="100%" height="100%" marginheight="0px" marginwidth="0px">Request unsuccessful. Incapsula incident ID: 500000570207585028-516670650047529873</iframe></body></html>'
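Since the request itself succeeds and only the body shows the block, it can help to check for it explicitly before saving anything. A minimal sketch, assuming the two marker strings visible in the output above are enough to recognise the page; the function name is_incapsula_block is made up for this example and the check is only a heuristic:

```python
def is_incapsula_block(body: bytes) -> bool:
    """Heuristic: True if the body looks like the Incapsula block page.

    The marker strings are taken from the response quoted in this post;
    other block pages may look different (assumption).
    """
    text = body.decode("utf-8", errors="replace")
    return "_Incapsula_Resource" in text or "Incapsula incident ID" in text
```

Calling is_incapsula_block(respData) right after resp.read() would tell the script to stop instead of writing the block page to withHeaders.txt.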
#2
Use the requests library instead of urllib.request; see https://python-forum.io/Thread-Web-Scraping-part-1
and/or https://python-forum.io/Thread-Web-scraping-part-2
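For reference, the same fetch with requests might look roughly like the sketch below. The names fetch_page and DEFAULT_HEADERS, the header values, and the 10 second timeout are assumptions made for illustration, not something from the linked tutorials. Note that a site behind Incapsula may still answer with the block page, since switching the HTTP library does not by itself get past bot protection.

```python
# Header values are illustrative assumptions; adjust them as needed.
DEFAULT_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/116.0 Safari/537.36"
    ),
    "Accept-Language": "en-US,en;q=0.9",
}


def fetch_page(url, session=None, timeout=10):
    """Return the decoded page body, raising on HTTP error statuses."""
    if session is None:
        # Third-party package (pip install requests), imported lazily so
        # a caller can pass in its own session object instead.
        import requests
        session = requests.Session()  # a session keeps cookies between calls
    response = session.get(url, headers=DEFAULT_HEADERS, timeout=timeout)
    response.raise_for_status()  # turn 4xx/5xx statuses into exceptions
    return response.text  # body decoded to str by requests
```

Usage would be something like html = fetch_page('https://www.whoscored.com/Players/5583/Show/Cristiano-Ronaldo'), followed by a check of the returned text before parsing it.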