Can urlopen be blocked by websites? - peterjv26 - Jul-26-2020

Hi,
I am trying to scrape a website with the following code, but it fails with the timeout error below.

    TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

If I try other websites, it works. I am wondering whether sites can block urlopen. Is there any way around it? I appreciate any help.

    from urllib.request import urlopen
    from bs4 import BeautifulSoup

    url = "https://www.nseindia.com/option-chain"
    html = urlopen(url)
    soup = BeautifulSoup(html, 'lxml')
    type(soup)
    title = soup.title
    print(title)

RE: Can urlopen be blocked by websites? - snippsat - Jul-26-2020

Use Requests rather than urllib, and send a User-Agent header that the site accepts.

    import requests
    from bs4 import BeautifulSoup

    # Browser-like User-Agent so the request is not treated as a script
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36'
    }
    url = 'https://www.nseindia.com/option-chain'
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.content, 'lxml')
    # Print the text of the page's <title> tag
    print(soup.find('title').text)

I doubt you can parse everything on this site. Turn off JavaScript in your browser and reload the page: what you see then is what you get with Requests/BS. This is a common problem that Selenium solves. There are many threads about it here if you search, including one about another stock site.

RE: Can urlopen be blocked by websites? - peterjv26 - Jul-26-2020

Great. It worked. Thank you.
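
For anyone who wants to stay with the standard library: urlopen can also send a custom User-Agent if you pass it a Request object instead of a bare URL. The snippet below is only a minimal sketch of that idea. The helper names build_request and fetch and the 10-second timeout are assumptions made for illustration, and the thread does not confirm that this alone is enough for this particular site.

```python
from urllib.request import Request, urlopen

# Same browser-like User-Agent string as in the Requests example above
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36"
)


def build_request(url, user_agent=BROWSER_UA):
    """Hypothetical helper: wrap url in a Request carrying a User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch(url, timeout=10):
    """Hypothetical helper: return the response body as bytes, or None on failure.

    The timeout (an assumed value) makes a silent block fail after a few
    seconds instead of hanging until the OS gives up.
    """
    try:
        with urlopen(build_request(url), timeout=timeout) as response:
            return response.read()
    except OSError as exc:  # URLError, HTTPError and socket timeouts are all OSError
        print(f"Request failed: {exc}")
        return None


# Usage (needs network access):
# html = fetch("https://www.nseindia.com/option-chain")
```

If fetch returns bytes, they can be passed to BeautifulSoup in the same way as response.content in the Requests version.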