Web Scraping on href text - Printable Version

+- Python Forum (https://python-forum.io)
+-- Forum: Python Coding (https://python-forum.io/forum-7.html)
+--- Forum: Web Scraping & Web Development (https://python-forum.io/forum-13.html)
+--- Thread: Web Scraping on href text (/thread-22461.html)
RE: Web Scraping on href text - Superzaffo - Nov-15-2019

Ok. No problem. I'm in no hurry. Thank you :-D

RE: Web Scraping on href text - Superzaffo - Nov-16-2019

I wrote this code, based on your example (thank you):

```python
from bs4 import BeautifulSoup
import requests


class ScrapeOrchids:
    def __init__(self):
        self.main_url = 'http://www.orchidspecies.com/indexe-ep.htm'
        self.links = {}
        self.get_initial_list()
        self.show_links()

    def get_initial_list(self):
        baseurl = 'http://www.orchidspecies.com/'
        response = requests.get(self.main_url)
        if response.status_code == 200:
            page = response.content
            soup = BeautifulSoup(page, 'lxml')
            # css_select link can be found using browser inspect element, then right click-->Copy-->CSS_Selector
            for i in soup.select("li"):
                #print(i.a.text)
                # Skip <li> items without a link, where i.a is None
                if i.a and 'Epiblastus lancipetalus' in i.a.text:
                    #print(i.a.get('href'))
                    self.links[i.a.text.strip()] = f"{baseurl}{i.a.get('href')}"
        else:
            print(f"Problem fetching {self.main_url}")

    def show_links(self):
        for key, value in self.links.items():
            print(f"{key}: {value}")


if __name__ == '__main__':
    ScrapeOrchids()
```

This is the result, and it is what I want. Now I need to follow the new link and, from that page, save the image of the orchid in an Excel file. :-(
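For the next step (following the collected link and putting the orchid photo into an Excel file), a minimal sketch might look like the code below. It assumes the first `<img>` on the species page is the orchid photo, which has not been checked against the site, and that `openpyxl` and Pillow are installed (`openpyxl` needs Pillow to embed images). The helper names `find_image_urls`, `download_image` and `save_image_to_excel`, and the default file name `orchid.jpg`, are hypothetical and not from the thread.

```python
from pathlib import Path
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup


def find_image_urls(html, page_url):
    """Return the absolute URL of every <img> tag that has a src attribute."""
    soup = BeautifulSoup(html, 'html.parser')
    # urljoin resolves relative src values against the page they came from
    return [urljoin(page_url, img['src']) for img in soup.find_all('img', src=True)]


def download_image(image_url, target_path):
    """Save the image at image_url to target_path and return the path."""
    response = requests.get(image_url, timeout=30)
    response.raise_for_status()
    path = Path(target_path)
    path.write_bytes(response.content)
    return path


def save_image_to_excel(name, image_url, xlsx_path, image_path='orchid.jpg'):
    """Write the species name to A1 and embed the downloaded image at B1."""
    # Imported here so the helpers above work without openpyxl and Pillow.
    from openpyxl import Workbook
    from openpyxl.drawing.image import Image

    # Assumption: the image is kept on disk; the file must exist when saving.
    local_image = download_image(image_url, image_path)
    workbook = Workbook()
    sheet = workbook.active
    sheet['A1'] = name
    sheet.add_image(Image(str(local_image)), 'B1')
    workbook.save(xlsx_path)
```

With the `links` dictionary built by `ScrapeOrchids`, you could fetch each page with `requests.get(url).text`, pass it to `find_image_urls(html, url)`, and hand the first result to `save_image_to_excel(name, image_url, 'orchids.xlsx')`. If the first image turns out to be a logo or banner, the selection would need a more specific filter.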