Python Forum
urllib urlopen getting error 400 on 1 specific page
#1
So I'm trying to scrape some info from a page with stock quotes. I'm getting an HTTP error 400, which only happens on this page; I have tried a range of other sites without problems.

My code looks like this:
from urllib.request import urlopen as uReg
my_url = 'http://www.nasdaqomxnordic.com/aktier'
uClient = uReg(my_url)
Any ideas what would cause just this 1 page to give me an error?
#2
Use Requests, not urllib.
>>> import requests
>>> my_url = 'http://www.nasdaqomxnordic.com/aktier'
>>> r = requests.get(my_url)
>>> r.status_code
200
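If you want to see what urllib actually gets back, here is a rough sketch using only the standard library. It catches the `HTTPError` so the status code and error body can be inspected, and it sends an explicit `User-Agent` header. That the default urllib headers are the reason for the 400 is only a guess on my part, not something confirmed for this site; `build_request`, `fetch` and the `Mozilla/5.0` value are names made up for the example.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def build_request(url, user_agent="Mozilla/5.0"):
    """Build a Request with an explicit User-Agent (the value is only an example)."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch(url):
    """Return (status, body); on an HTTP error, return the error code and error body."""
    try:
        with urlopen(build_request(url), timeout=10) as response:
            return response.status, response.read()
    except HTTPError as err:
        # HTTPError also behaves like a response, so the error page can be read.
        return err.code, err.read()
```

Calling `fetch(my_url)` and printing the status and the first part of the body should show whether the 400 is still there and what the server says about it.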
A basic example that gets the title:
import requests
from bs4 import BeautifulSoup

url = 'http://www.nasdaqomxnordic.com/aktier'
url_get = requests.get(url)
soup = BeautifulSoup(url_get.content, 'lxml')
print(soup.find('title').text)
Output:
Shares - share prices for all companies listed on NASDAQ OMX Nordic - Nasdaq
Sites like this use a lot of JavaScript, so check whether there is an API that e.g. gives JSON back.
With plain scraping you may need Selenium to get the JavaScript-rendered content.
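A small sketch of the "does it give JSON back" check. It looks at the `Content-Type` header and only decodes the body when the server says it is JSON. The helper name `parse_json_response` is made up for this example, and I'm not assuming any particular endpoint on this site; you would have to find one yourself, for instance in the browser's network tab.

```python
import json


def parse_json_response(content_type, body):
    """Return the decoded JSON if Content-Type says JSON, otherwise None."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type != "application/json":
        return None
    return json.loads(body)
```

With Requests, the two arguments would be `r.headers.get('Content-Type', '')` and `r.content`.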
#3
Use requests:
>>> import requests
>>> my_url = 'http://www.nasdaqomxnordic.com/aktier'
>>> response = requests.get(my_url, allow_redirects=False)
>>> if response.status_code == 200:
...     uClient = response.content
... else:
...     print('Transfer error')
...
#4
Race posting...
#5
Thanks to both of you.

And yes Snipsatt, they have an API, but it's expensive, and I'm actually mostly using this to learn another part of programming.