Python Forum

Full Version: urllib urlopen getting error 400 on 1 specific page
I'm trying to scrape some info from a page with stock quotes. I'm getting an HTTP 400 error, which only happens on this page; I have tried a range of other sites without problems.

My code looks like this:
from urllib.request import urlopen as uReg
my_url = 'http://www.nasdaqomxnordic.com/aktier'
uClient = uReg(my_url)
Any ideas what would cause just this 1 page to give me an error?
Use Requests, not urllib.
>>> import requests
>>> my_url = 'http://www.nasdaqomxnordic.com/aktier'
>>> r = requests.get(my_url)
>>> r.status_code
200
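If you would rather stay with urllib, here is a minimal sketch for seeing what the server actually answers. It assumes (this is a guess, nothing in this thread confirms it) that the server dislikes the default urllib User-Agent, so it sends an explicit one. The helper names build_request and fetch and the 'Mozilla/5.0' value are made up for illustration.

```python
from urllib.request import Request, urlopen
from urllib.error import HTTPError, URLError


def build_request(url, user_agent='Mozilla/5.0'):
    """Return a Request carrying an explicit User-Agent header.

    Assumption: the default 'Python-urllib/x.y' agent is what the
    server rejects. That is a guess, not a confirmed cause.
    """
    return Request(url, headers={'User-Agent': user_agent})


def fetch(url):
    """Return the response body as bytes, or None if the request fails."""
    try:
        with urlopen(build_request(url), timeout=10) as response:
            return response.read()
    except HTTPError as err:
        # err.code is the HTTP status (e.g. 400), err.reason the status text
        print('HTTP error:', err.code, err.reason)
    except URLError as err:
        # No HTTP response at all (DNS failure, refused connection, ...)
        print('Connection problem:', err.reason)
    return None
```

Calling fetch('http://www.nasdaqomxnordic.com/aktier') then either returns the page bytes or prints the status code, which at least tells you whether the header made a difference.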
A basic example that gets the page title:
import requests
from bs4 import BeautifulSoup

url = 'http://www.nasdaqomxnordic.com/aktier'
url_get = requests.get(url)
soup = BeautifulSoup(url_get.content, 'lxml')
print(soup.find('title').text)
Output:
Shares - share prices for all companies listed on NASDAQ OMX Nordic - Nasdaq
Sites like this use a lot of JavaScript, so check whether there is an API that, for example, returns JSON.
With plain scraping you may need Selenium to get the JavaScript-rendered content.
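To show why a JSON API is easier to work with than scraped HTML, here is a small sketch. The payload and its field names ('name', 'last') are hypothetical and not taken from the Nasdaq site; a real endpoint, if one exists, will look different.

```python
import json

# Hypothetical payload, for illustration only.
SAMPLE = (
    '[{"name": "Example A", "last": "101.5"},'
    ' {"name": "Example B", "last": "87.25"}]'
)


def parse_quotes(json_text):
    """Return a dict mapping each name to its last price as a float."""
    rows = json.loads(json_text)
    return {row['name']: float(row['last']) for row in rows}


quotes = parse_quotes(SAMPLE)
print(quotes)
```

With Requests you would normally get the parsed data from response.json() instead of calling json.loads yourself.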
Use requests:
>>> import requests
>>> my_url = 'http://www.nasdaqomxnordic.com/aktier'
>>> response = requests.get(my_url, allow_redirects=False)
>>> if response.status_code == 200:
...     uClient = response.content
... else:
...     print('Transfer error')
...
Race posting...
Thanks to both of you.

And yes, Snipsatt, they have an API, but it's expensive, and I'm mostly using this to learn another part of programming.