 Can urlopen be blocked by websites?
#1
Hi, I am trying to scrape a website with the following code, but it fails with the timeout error below.

TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

If I try the same code with other websites, it works.
I am wondering whether sites can block urlopen.
Is there any way around this?
I would appreciate any help.

from urllib.request import urlopen
from bs4 import BeautifulSoup

url = "https://www.nseindia.com/option-chain"
html = urlopen(url)

soup = BeautifulSoup(html,'lxml')
type(soup)

title = soup.title
print(title)
#2
Use Requests instead of urllib, and send a User-Agent header that the site accepts.
import requests
from bs4 import BeautifulSoup

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36'
}

url = 'https://www.nseindia.com/option-chain'
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'lxml')
print(soup.find('title').text)
Output:
NSE - Option Chain
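If you would rather stay with urllib, a rough equivalent is sketched here. By default urllib identifies itself as "Python-urllib/3.x", which some sites refuse to answer, so the sketch sends the same browser-like User-Agent as the Requests code above and sets an explicit timeout. The helper names `build_request` and `fetch` and the 10-second timeout are my own assumptions, and I have not verified that this particular site accepts the request.

```python
from urllib.request import Request, urlopen

# Assumption: reuse the browser-like User-Agent string from the Requests example.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36"
)


def build_request(url, user_agent=USER_AGENT):
    """Return a Request object that carries a browser-like User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch(url, timeout=10):
    """Return the response body as bytes, or None if the request fails."""
    try:
        # timeout makes a silent server fail fast instead of hanging.
        with urlopen(build_request(url), timeout=timeout) as response:
            return response.read()
    except OSError as exc:  # URLError and TimeoutError are both OSError subclasses
        print(f"Request failed: {exc}")
        return None


# Example usage (performs a real network request):
# html = fetch("https://www.nseindia.com/option-chain")
# if html is not None:
#     soup = BeautifulSoup(html, "lxml")  # needs: from bs4 import BeautifulSoup
```

Requests is still the more convenient choice; this only shows that the header, not the library, is what the server reacts to.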
To see what you can actually parse on this site, turn off JavaScript in your browser and reload the page.
What you see then is what you get with Requests/BS.
Content that is rendered by JavaScript is a common problem, and Selenium solves it.
There are many threads about this here if you search, including one about another stock site.
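A small sketch of why JavaScript matters: a parser only sees the HTML the server sent, never what a script would add later. The page in `SAMPLE_HTML` is made up for illustration, and the sketch uses the standard library's `html.parser` rather than BeautifulSoup so that it runs without extra installs; `TextCollector` and `visible_text` are hypothetical helper names.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect text nodes, skipping the source code inside <script> tags."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        text = data.strip()
        if text and not self.in_script:
            self.chunks.append(text)


# Made-up page: the <div> is empty until the script runs in a browser.
SAMPLE_HTML = """
<html><head><title>Demo</title></head>
<body>
<div id="table"></div>
<script>document.getElementById("table").textContent = "Loaded by JS";</script>
</body></html>
"""


def visible_text(html):
    """Return the text a static parser can see in the given HTML."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return parser.chunks


print(visible_text(SAMPLE_HTML))  # only the title; "Loaded by JS" never appears
```

The script is never executed, so the text it would insert is missing from the result. That is the gap a browser-driving tool such as Selenium fills.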
#3
Great. It worked. Thank you.