Use Requests rather than urllib; you will also need a User-Agent header, or Amazon will answer with a 503.
You will also need Selenium, since Amazon relies heavily on JavaScript.
First, a demo with Requests:
import requests
from bs4 import BeautifulSoup

url = 'https://www.amazon.com/Advanced-ASP-NET-Core-Security-Vulnerabilities/dp/1484260139/ref=sr_1_1?dchild=1&keywords=Advanced'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.75 Safari/537.36'}
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'lxml')
print(soup.find('title').text)

Output:
Amazon.com

Test:
>>> response
<Response [200]>
>>> soup.p
<p class="a-last">Sorry, we just need to make sure you're not a robot. For best results, please make sure your browser is accepting cookies.</p>
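One way to detect that interstitial programmatically is to look for the phrase it contains. A minimal sketch; matching on "not a robot" is a heuristic based on the page above, not any documented Amazon contract:

```python
def is_robot_check(html: str) -> bool:
    """Heuristic: Amazon's bot interstitial contains the phrase
    "not a robot" (an assumption based on the page shown above)."""
    return "not a robot" in html.lower()

# The <p> tag from the session above trips the check:
blocked = ('<p class="a-last">Sorry, we just need to make sure '
           "you're not a robot.</p>")
print(is_robot_check(blocked))                              # True
print(is_robot_check('<title>Real product page</title>'))   # False
```

A check like this lets a scraper bail out (or switch to Selenium) instead of silently parsing the captcha page.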
So now we get a 200, but as you can see Amazon now wants a real browser and cookies. This is when Selenium comes into the picture; search the forum for examples, and also take a look at web-scraping part 2.
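A minimal Selenium sketch of the same lookup. This assumes Chrome with a matching chromedriver on PATH; the headless flag and the URL reuse are illustrative, not a guaranteed way past Amazon's bot checks:

```python
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument('--headless')  # run without opening a visible window

driver = webdriver.Chrome(options=options)
try:
    driver.get('https://www.amazon.com/Advanced-ASP-NET-Core-Security-Vulnerabilities/dp/1484260139')
    # Selenium drives a real browser, so the page's JavaScript runs
    # and cookies are handled, unlike the plain Requests call above.
    print(driver.title)
finally:
    driver.quit()
```

Because a real browser accepts cookies and executes JavaScript, this typically returns the actual product title rather than the robot-check page.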