Python Forum

Full Version: urllib2.urlopen() user agent header
urllib2.urlopen() sends a user agent HTTP header in the request that identifies the library.  what i want to do is send the request without any user agent header at all (just the host header).  does anyone know how to do that?  i don't see it in the docs. i do see vague (not enough detail to know what to do) instructions on how to change it, but nothing about how to remove it. do i need to just make a plain TCP connection and do my own HTTP?

Output:
User-Agent: Python-urllib/3.5
You can manipulate the user agent; the easiest way is of course to use Requests (which should always be used anyway instead of urllib).
λ ptpython 
>>> import requests

>>> r = requests.get("https://httpbin.org/headers", headers={"user-agent": "My cool Useragent" })
>>> print(r.text)
{
 "headers": {
   "Accept": "*/*",
   "Accept-Encoding": "gzip, deflate",
   "Connection": "close",
   "Host": "httpbin.org",
   "User-Agent": "My cool Useragent"
 }
}
I have a post here where I use urllib with FancyURLopener.
This post sets the user agent with Requests to get access.
i don't want to set user agent, i want to unset it so the header is absent.
import urllib2

request = urllib2.Request("http://domain.com", headers={'User-agent': ''})
response = urllib2.urlopen(request).read()
(Jun-29-2017, 05:45 AM) Skaperen Wrote: i don't want to set user agent, i want to unset it so the header is absent.
Just try it and see how much you can strip out before the server refuses the connection.
import requests

headers = {
    'user-agent': '',
    'values': '',
    }

r = requests.get("https://httpbin.org/headers", headers=headers)
print(r.text)
Output:
{
 "headers": {
   "Accept": "*/*",
   "Accept-Encoding": "gzip, deflate",
   "Connection": "close",
   "Host": "httpbin.org",
   "User-Agent": "",
   "Values": ""
 }
}
i guess i will just go ahead and make a plain TCP connection and do minimal HTTP myself (trivial). i already tried it with the telnet command and the server responded as desired.
You could just use the socket module.
(Jul-01-2017, 12:27 PM) wavic Wrote: You could just use the socket module.
that's how i would make a plain TCP connection.
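For anyone who wants to try the plain TCP route, here is a minimal sketch of what it might look like with the socket module. The names build_request and fetch_raw are made up for illustration, and it assumes plain HTTP on port 80 (no TLS) and that the server accepts an HTTP/1.0 request carrying only a Host header. HTTP/1.0 is used here so the server closes the connection after the response and the read loop can stop at end of stream.

```python
import socket


def build_request(host, path="/"):
    """Build a GET request whose only header is Host (hypothetical helper)."""
    lines = [
        "GET {} HTTP/1.0".format(path),
        "Host: {}".format(host),
        "",  # blank line that ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_raw(host, port=80, path="/", timeout=10.0):
    """Send the request over a plain TCP socket and return the raw response bytes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        # Read until the server closes the connection.
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


# Hypothetical usage (needs network access):
# print(fetch_raw("httpbin.org", 80, "/headers").decode("utf-8", "replace"))
```

The return value is the whole raw response, status line and headers included, so you would still have to split off the body yourself.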
https://github.com/python/cpython/blob/2...b2.py#L336 Wrote:
class OpenerDirector:
   def __init__(self):
       client_version = "Python-urllib/%s" % __version__
       self.addheaders = [('User-agent', client_version)]

So the default opener for urllib2 always adds the user agent header when it is initialized, and there's no option to disable it. But you can create your own opener and strip that header out of it. Something like this?
import urllib2
opener = urllib2.build_opener()
opener.addheaders = [header for header in opener.addheaders if header[0] != "User-agent"]
urllib2.install_opener(opener)

# now you can use urllib2.urlopen() normally, without the user agent header
data = urllib2.urlopen("https://python-forum.io")
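Since the output in the first post shows Python-urllib/3.5, the same idea on Python 3 would probably go through urllib.request instead of urllib2. This is only a sketch: build_opener_without_user_agent is a made-up helper name, and it assumes the default header still lives in opener.addheaders the way it does in the source quoted above. Note that the standard library can still add a few headers of its own besides Host (as far as I know Accept-Encoding and Connection), so this removes the user agent but may not get you down to the Host header alone.

```python
import urllib.request


def build_opener_without_user_agent():
    """Return an opener whose default headers omit User-agent (hypothetical helper)."""
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) tuples; drop the default UA entry.
    opener.addheaders = [
        (name, value)
        for name, value in opener.addheaders
        if name.lower() != "user-agent"
    ]
    return opener


opener = build_opener_without_user_agent()

# Hypothetical usage (needs network access):
# with opener.open("https://httpbin.org/headers") as response:
#     print(response.read().decode("utf-8"))
```

Calling opener.open() directly keeps the change local; urllib.request.install_opener(opener) would make urllib.request.urlopen() use it everywhere, like the urllib2 version above.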