Python Forum
Can't open Amazon page
#1
Hello,

Here is some simple code that raises an error:
from urllib.request import urlopen
from bs4 import BeautifulSoup

url = 'https://www.amazon.com/Advanced-ASP-NET-Core-Security-Vulnerabilities/dp/1484260139/ref=sr_1_1?dchild=1&keywords=Advanced+ASP.NET+Core+3+Security&qid=1602852997&s=books&sr=1-1.html'

html = urlopen(url)
============ RESTART: /home/pavel/python_code/parse_amazon_url.py ============
Traceback (most recent call last):
File "/home/pavel/python_code/parse_amazon_url.py", line 6, in <module>
html = urlopen(url)


Where is the problem?

Thanks
#2
You're going to need to post the entire traceback as the piece you've shown doesn't say what the problem is.
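
If the last line of the traceback is hard to capture, one way to surface it is to catch the exception and report it yourself. This is only a rough sketch with the standard library: fetch_status and its opener parameter are names made up for illustration, and the 503 used in testing it is an assumed example, since the traceback posted above is cut off before the actual error.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def fetch_status(url, opener=urlopen):
    """Return a one-line summary of what happened when opening url.

    opener is a hypothetical hook (defaults to urlopen) so the function
    can be exercised without touching the network.
    """
    try:
        with opener(url) as response:
            return f"OK {response.status}"
    except HTTPError as exc:
        # The server answered, but with an error status code.
        return f"HTTP error {exc.code}: {exc.reason}"
    except URLError as exc:
        # No usable answer at all (DNS failure, refused connection, ...).
        return f"Connection problem: {exc.reason}"
```

Calling fetch_status(url) with the URL from the first post and posting the returned line would show which kind of failure it is.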
#3
Use Requests instead of urllib; you also need a User-Agent header to avoid getting a 503.
You will also need Selenium, as Amazon uses a lot of JavaScript.

Here is a demo with Requests:
import requests
from bs4 import BeautifulSoup

url = 'https://www.amazon.com/Advanced-ASP-NET-Core-Security-Vulnerabilities/dp/1484260139/ref=sr_1_1?dchild=1&keywords=Advanced'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.75 Safari/537.36'}
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'lxml')
print(soup.find('title').text)
Test:
Output:
Amazon.com
>>> response
<Response [200]>
>>> soup.p
<p class="a-last">Sorry, we just need to make sure you're not a robot. For best results, please make sure your browser is accepting cookies.</p>
So now we get a 200, but as you can see, the site wants a real browser and cookies.
This is where Selenium comes into the picture. Search the forum for it; you can also look at Web-scraping part-2.
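
Since a 200 status alone does not mean the real page came back, it can help to test the returned HTML for the check page before parsing it. A minimal sketch, assuming the wording in the output above stays the same: is_robot_check and ROBOT_CHECK_MARKER are hypothetical names, and the marker text is taken from that output, not from any documented behaviour of the site.

```python
import html

# Assumption: the check page keeps the wording seen in the output above.
ROBOT_CHECK_MARKER = "make sure you're not a robot"


def is_robot_check(page_text):
    """Return True if the page text looks like the robot-check page."""
    # Unescape entities such as &#39; so an encoded apostrophe still matches.
    return ROBOT_CHECK_MARKER in html.unescape(page_text).lower()
```

With the demo above, is_robot_check(response.text) would tell you whether to go on to soup.find() or to switch to a browser-based approach.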
#4
You can also send cookies with requests. That being said, Selenium may well be the best option.

If you haven't used requests-html, I would recommend looking at it for anything JavaScript-related. It's a great tool and sits in between Selenium and Requests.
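
To show what sending cookies with Requests can look like, here is a rough sketch using requests.Session. build_session is a made-up helper, and the cookie name and value are placeholders, not cookies the site is known to require; in practice you would copy real ones from a browser session. There is no guarantee this gets past the robot check.

```python
import requests

# Same User-Agent string as in the demo earlier in the thread.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/86.0.4240.75 Safari/537.36"
)


def build_session(cookies):
    """Create a Session that sends the given cookies and a browser User-Agent."""
    session = requests.Session()
    # Headers set on the session are sent with every request made through it.
    session.headers.update({"User-Agent": USER_AGENT})
    # The session cookie jar accepts a plain dict of name/value pairs.
    session.cookies.update(cookies)
    return session


# Placeholder cookie for illustration only.
session = build_session({"example-cookie": "placeholder-value"})
```

After that, session.get(url) sends the cookies and the header together, and any cookies the server sets in its response are kept in the session for later requests.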