Simple For Loop Question
Python Forum (https://python-forum.io), Forum: Web Scraping & Web Development, Thread: /thread-6451.html
Simple For Loop Question - zykbee - Nov-23-2017

The purpose of the program is to go to a website, make a list of the links, then visit each link and print its canonical tag. More or less, it works fine. However, I do have one issue: when there is no canonical tag, the program simply moves on instead of printing "none" or something similar. How would I remedy this? Any ideas?

    from bs4 import BeautifulSoup
    import requests
    import re

    site = requests.get('http://www.angelfire.com/comics/gameroom/').text
    qqq = BeautifulSoup(site, 'html.parser')
    for item in qqq.findAll('a', attrs={'href': re.compile("^http://")}):
        listoflinks = (item.get('href').split())
        print("link= ", listoflinks)
        for x in listoflinks:
            sit = requests.get((x)).text
            ppp = BeautifulSoup(sit, 'html.parser')
            for y in ppp.findAll('link',{"rel":"canonical"}):
                lll = (y.get('href').split())
                print(" canonical= ", lll)

RE: Simple For Loop Question - zykbee - Nov-23-2017

Never mind. In case anyone sees this in the future, these are the changes I made:

    from bs4 import BeautifulSoup
    import requests
    import re

    site = requests.get('http://www.angelfire.com/comics/gameroom/').text
    qqq = BeautifulSoup(site, 'html.parser')
    for item in qqq.findAll('a', attrs={'href': re.compile("^http://")}):
        listoflinks = (item.get('href').split())
        print("Original Link= ", listoflinks)
        for x in listoflinks:
            sit = requests.get((x)).text
            ppp = BeautifulSoup(sit, 'html.parser')
            # Collect every canonical <link> tag up front instead of looping over them
            y = ppp.find_all('link',{"rel":"canonical"})
            lll = [t.get('href') for t in y]
            # An empty list means the page has no canonical tag
            if len(lll) == 0:
                print("Canonical Link = N/A")
            else:
                print("Canonical Link=", lll)

RE: Simple For Loop Question - nilamo - Nov-28-2017

Thanks for sharing :)
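
For anyone wondering why the first version printed nothing: a for loop over an empty result never runs its body, so a print inside the loop is skipped entirely when no canonical tag is found. Below is a minimal sketch of that difference in plain Python, with no network access or bs4 needed. The function names and the example URL are made up for illustration, and the lists stand in for the hrefs collected from find_all().

```python
def report_with_loop(hrefs):
    """Mimic the original loop: one line per href, so an empty list yields no lines."""
    lines = []
    for href in hrefs:
        lines.append("Canonical Link = " + href)
    return lines


def report_with_check(hrefs):
    """Mimic the revised version: fall back to N/A when the list is empty."""
    if not hrefs:
        return ["Canonical Link = N/A"]
    return ["Canonical Link = " + href for href in hrefs]


# Stand-ins for the href lists collected from find_all(); the URL is hypothetical.
found = ["http://example.com/page"]
missing = []

print(report_with_loop(missing))   # []
print(report_with_check(missing))  # ['Canonical Link = N/A']
print(report_with_check(found))    # ['Canonical Link = http://example.com/page']
```

Testing the list with `if not hrefs` behaves the same as `len(lll) == 0` in the post above; either one should work.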