[Help] Scrape website
Python Forum (https://python-forum.io), Forum: Web Scraping & Web Development, Thread: /thread-11686.html
[Help] Scrape website - mr_byte31 - Jul-21-2018

Hi all,

I have a website that I need to collect some info from. I tried simple code like this:

    import urllib.request

    # Send a browser-like User-Agent with the request
    headers = {}
    headers['User-Agent'] = "Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:48.0) Gecko/20100101 Firefox/48.0"

    url = 'https://www.biopharmcatalyst.com/calendars/historical-catalyst-calendar'
    x = urllib.request.Request(url, headers=headers)
    html = urllib.request.urlopen(x, timeout=10).read()

It didn't work; the Python program hangs!

I tried this as well:

    import requests

    url = 'https://www.biopharmcatalyst.com/calendars/historical-catalyst-calendar'
    url_get = requests.get(url)

It also didn't work! Any idea what the problem is?

RE: [Help] Scrape website - gontajones - Jul-21-2018

You can use the requests module (Requests). For more advanced scraping, I suggest you use the BeautifulSoup module (BeautifulSoup).

To see the content of the response in your second script, use .text:

    import requests

    url = 'https://www.biopharmcatalyst.com/calendars/historical-catalyst-calendar'
    url_get = requests.get(url)
    print(url_get.text)  # body of the response as a string

RE: [Help] Scrape website - Larz60+ - Jul-21-2018

For further reading: Web Scraping part1, Web Scraping part2
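
A possible follow-up to the first script: the thread never establishes why the program hung, so this is only a minimal sketch of a more defensive version, not a confirmed fix. It uses the standard library only, keeps the User-Agent string and the 10-second timeout from the original post, and returns None instead of hanging or raising when the request fails. The helper names build_request and fetch_html are made up for illustration.

```python
import urllib.request

# User-Agent string copied from the original post.
USER_AGENT = (
    "Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:48.0) "
    "Gecko/20100101 Firefox/48.0"
)


def build_request(url, user_agent=USER_AGENT):
    """Hypothetical helper: build a Request that carries a browser-like User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, timeout=10):
    """Hypothetical helper: return the decoded page body, or None on failure.

    The timeout applies to blocking socket operations, so a server that
    stops responding leads to an error rather than an indefinite wait.
    """
    request = build_request(url)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # Fall back to UTF-8 when the server does not declare a charset.
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")
    except OSError as exc:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        print(f"Request failed: {exc!r}")
        return None
```

Calling fetch_html with the URL from the first post would then either return the page source as a string or print the error and return None. Printing the error is a design choice for debugging: it shows whether the failure is a timeout, an HTTP status such as 403, or a connection problem.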