Nov-22-2019, 02:20 PM
Hello,
Here is the tag from which I want to extract a text fragment (in bold):
<a class="a-link-normal a-text-normal" href="/Cybersecurity-Intelligent-Systems-Reference-Library/dp/3319988417/ref=sr_1_1?keywords=9783319988412&qid=1574431833&sr=8-1">.
Here is my code:
import re
import urllib.request
from urllib.error import URLError, HTTPError, ContentTooShortError

from bs4 import BeautifulSoup


def download(url, user_agent='wswp', num_retries=2):
    print('Downloading:', url)
    request = urllib.request.Request(url)
    request.add_header('User-agent', user_agent)
    try:
        html = urllib.request.urlopen(request)
    except (URLError, HTTPError, ContentTooShortError) as e:
        print('Download error:', e.reason)
        html = None
        if num_retries > 0:
            if hasattr(e, 'code') and 500 <= e.code < 600:
                # recursively retry 5xx HTTP errors
                return download(url, user_agent, num_retries - 1)
    return html


html = download('http://www.amazon.com/s?k=9783319988412&ref=nb_sb_noss')
bs = BeautifulSoup(html.read(), 'lxml')
nameList = bs.find_all('a', {'href': re.compile('"/.*(keywords).*')})
print(len(nameList))
for name in nameList:
    print(name.get_text())

It doesn't work.
Any suggestions?
Thanks.
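One thing that may be worth checking, as a rough sketch rather than a confirmed fix: the pattern passed to find_all starts with a double quote, but a regex filter is searched against the attribute value itself, which does not include the quotes from the HTML source. The snippet below uses only the href string from the post and the standard library. The names `fixed` and `split_href` are made up for illustration, and it assumes the wanted fragment is one of the pieces of the href (title segment, the id after /dp/, or the keywords parameter), since the bold marking did not survive in the post.

```python
import re
from urllib.parse import urlsplit, parse_qs

# The href value from the post, as an attribute value (no surrounding quotes).
href = ("/Cybersecurity-Intelligent-Systems-Reference-Library/dp/3319988417/"
        "ref=sr_1_1?keywords=9783319988412&qid=1574431833&sr=8-1")

# Pattern as posted: it requires a literal '"' before the leading slash.
posted = re.compile('"/.*(keywords).*')

# Hypothetical replacement: a relative link that carries a keywords= parameter.
fixed = re.compile(r'^/.*[?&]keywords=')


def split_href(href):
    """Return (title_segment, id_after_dp, keywords) from a relative href.

    Any piece that is not present comes back as None.
    """
    parts = urlsplit(href)
    segments = [s for s in parts.path.split('/') if s]
    title_segment = segments[0] if segments else None
    # Take the segment that follows 'dp', if 'dp' is not the last segment.
    id_after_dp = (segments[segments.index('dp') + 1]
                   if 'dp' in segments[:-1] else None)
    keywords = parse_qs(parts.query).get('keywords', [None])[0]
    return title_segment, id_after_dp, keywords


posted_matches = posted.search(href) is not None
fixed_matches = fixed.search(href) is not None
title_segment, id_after_dp, keywords = split_href(href)

print('posted pattern matches:', posted_matches)
print('fixed pattern matches:', fixed_matches)
print(title_segment, id_after_dp, keywords)
```

With BeautifulSoup, the same compiled pattern could be passed as bs.find_all('a', href=fixed), and the value read with name['href'] instead of name.get_text(), which returns the link text rather than the href. Whether the downloaded page actually contains such links is a separate question that this sketch does not cover.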