Open URL via proxy - perseus142 - Jun-02-2020
Hello team,
I am a newbie and I am looking for help.
I would like to access a webpage and retrieve data - sounds easy, right?
However, I have to use a proxy, and this is where I am stuck :(
I have already tried advice from Google and Stack Overflow, but I am still getting errors.
For example: https://stackoverflow.com/questions/34576665/setting-proxy-to-urllib-request-python3
My code #1:
import urllib.request as request
proxy_handler = request.ProxyHandler({'http': '<proxy omitted>'})
opener = request.build_opener(proxy_handler)
url = 'http://data.pr4e.org/romeo.txt'
# open the website with the opener
req = opener.open(url)
data = req.read().decode('utf8')
print(data)
Error:
PS C:\Users\<user>\Desktop> python .\week4.py
Traceback (most recent call last):
File ".\week4.py", line 33, in <module>
req = opener.open(url)
File "C:\Users\<user>\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "C:\Users\<user>\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 640, in http_response
response = self.parent.error(
File "C:\Users\<user>\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 569, in error
return self._call_chain(*args)
File "C:\Users\<user>\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 502, in _call_chain
result = func(*args)
File "C:\Users\<user>\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
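The thread does not say whether the 404 above came from the proxy or from data.pr4e.org. One way to narrow that down is to catch the HTTPError and look at the response headers. The following is only a sketch: fetch_via_proxy and describe_http_error are hypothetical helper names, and the proxy argument is assumed to be a plain 'host:port' string, which may not match the omitted value in the original code.

```python
import urllib.error
import urllib.request


def fetch_via_proxy(url, proxy, timeout=10):
    """Fetch url through an HTTP proxy given as 'host:port' (assumed format)."""
    handler = urllib.request.ProxyHandler({"http": f"http://{proxy}"})
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as resp:
        return resp.read().decode("utf8")


def describe_http_error(err):
    """Summarise an HTTPError, including headers that hint at who answered."""
    headers = err.headers or {}
    server = headers.get("Server", "unknown")  # software that sent the reply
    via = headers.get("Via", "none")           # often added by intermediaries
    return f"{err.code} {err.reason} (Server: {server}, Via: {via})"
```

A call would look like this, with 'proxy.example:8001' as a placeholder address:

```python
try:
    print(fetch_via_proxy("http://data.pr4e.org/romeo.txt", "proxy.example:8001"))
except urllib.error.HTTPError as err:
    print(describe_http_error(err))
```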
My code #2:
from urllib import request as urlrequest
proxy_host = '<proxy omitted>' # host and port of your proxy
url = 'http://data.pr4e.org/romeo.txt'
req = urlrequest.Request(url)
req.set_proxy(proxy_host, 'http')
response = urlrequest.urlopen(req)
print(response.read().decode('utf8'))
Error:
PS C:\Users\<user>\Desktop> python .\week4.py
Traceback (most recent call last):
File "C:\Users\<user>\AppData\Local\Programs\Python\Python38\lib\http\client.py", line 865, in _get_hostport
port = int(host[i+1:])
ValueError: invalid literal for int() with base 10: '8001/one-de-vpn.pac'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File ".\week4.py", line 29, in <module>
response = urlrequest.urlopen(req)
File "C:\Users\<user>\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\<user>\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 525, in open
response = self._open(req, data)
File "C:\Users\<user>\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 542, in _open
result = self._call_chain(self.handle_open, protocol, protocol +
File "C:\Users\<user>\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 502, in _call_chain
result = func(*args)
File "C:\Users\<user>\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 1348, in http_open
return self.do_open(http.client.HTTPConnection, req)
File "C:\Users\<user>\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 1288, in do_open
h = http_class(host, timeout=req.timeout, **http_conn_args)
File "C:\Users\<user>\AppData\Local\Programs\Python\Python38\lib\http\client.py", line 829, in __init__
(self.host, self.port) = self._get_hostport(host, port)
File "C:\Users\<user>\AppData\Local\Programs\Python\Python38\lib\http\client.py", line 870, in _get_hostport
raise InvalidURL("nonnumeric port: '%s'" % host[i+1:])
http.client.InvalidURL: nonnumeric port: '8001/one-de-vpn.pac'
The website http://data.pr4e.org/romeo.txt is accessible via the proxy when using a browser.
Please advise.
Thank you.
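A note on the second traceback: the message nonnumeric port: '8001/one-de-vpn.pac' suggests that the value passed to set_proxy ended in the path of a .pac file. A PAC (proxy auto-config) file is a script that browsers evaluate to pick a proxy. urllib does not evaluate PAC files and expects a plain host and port instead. Below is a minimal sketch of checking the value before using it. The names parse_proxy and build_proxy_opener are hypothetical, and 'proxy.example:8001' is a placeholder, since the real proxy address is omitted in the post and would have to be taken from the PAC file or the network administrator.

```python
import urllib.request


def parse_proxy(proxy):
    """Split 'host:port'; reject URLs, paths, and non-numeric ports."""
    if "/" in proxy:
        raise ValueError(f"expected 'host:port' without scheme or path, got {proxy!r}")
    host, sep, port = proxy.rpartition(":")
    if not sep or not host or not port.isdecimal():
        raise ValueError(f"expected 'host:port', got {proxy!r}")
    return host, int(port)


def build_proxy_opener(proxy):
    """Return an opener that sends http:// requests through the given proxy."""
    host, port = parse_proxy(proxy)
    handler = urllib.request.ProxyHandler({"http": f"http://{host}:{port}"})
    return urllib.request.build_opener(handler)
```

With a valid address, build_proxy_opener('proxy.example:8001').open(url) would then replace the urlopen call, while a PAC-style value fails early with a readable ValueError rather than deep inside http.client.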
RE: Open URL via proxy - perseus142 - Jun-17-2020
Please disregard and delete the topic.
RE: Open URL via proxy - micseydel - Jun-18-2020
We don't delete topics, but if you found the solution we'd appreciate you sharing it with us here in case someone finds it helpful in the future.