Nov-16-2019, 10:52 AM
(This post was last modified: Nov-16-2019, 10:52 AM by Superzaffo.)
I wrote this code, based on your example (thank you).
Now I need to follow the new link and, from that page, save the image of the orchid in an Excel file. :-(
from bs4 import BeautifulSoup
import requests


class ScrapeOrchids:
    def __init__(self):
        self.main_url = 'http://www.orchidspecies.com/indexe-ep.htm'
        self.links = {}
        self.get_initial_list()
        self.show_links()

    def get_initial_list(self):
        baseurl = 'http://www.orchidspecies.com/'
        response = requests.get(self.main_url)
        if response.status_code == 200:
            page = response.content
            soup = BeautifulSoup(page, 'lxml')
            # css_select link can be found using browser inspect element,
            # then right click-->Copy-->CSS_Selector
            for i in soup.select("li"):
                #print(i.a.text)
                if 'Epiblastus lancipetalus' in i.a.text:
                    #print(i.a.get('href'))
                    self.links[i.a.text.strip()] = f"{baseurl}{i.a.get('href')}"
        else:
            print(f"Problem fetching {self.main_url}")

    def show_links(self):
        for key, value in self.links.items():
            print(f"{key}: {value}")


if __name__ == '__main__':
    ScrapeOrchids()

This is the result:
Output:
Epiblastus lancipetalus Schltr. 1911: http://www.orchidspecies.com/epiblancipetalus.htm
That is what I want. Now I need to follow the new link and, from that page, save the image of the orchid in an Excel file. :-(
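For reference, a rough sketch of what that next step might look like. It is only a starting point under several assumptions: the helper names (`ImageSrcParser`, `find_image_urls`, `save_orchid_to_excel`) and the output file name `orchids.xlsx` are made up for illustration, the first `<img>` on the species page is assumed to be the orchid photo (I have not checked the page layout), and the Excel part relies on openpyxl, which needs Pillow installed to embed images. The image lookup uses the standard library's `html.parser` rather than BeautifulSoup so that it runs without extra packages.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urljoin


class ImageSrcParser(HTMLParser):
    """Collect the src attribute of every <img> tag on a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_image_urls(html, page_url):
    """Return absolute URLs for all <img> tags found in html."""
    parser = ImageSrcParser()
    parser.feed(html)
    # Relative paths are resolved against the page they came from.
    return [urljoin(page_url, src) for src in parser.sources]


def save_orchid_to_excel(name, page_url, xlsx_path="orchids.xlsx"):
    """Download the first image on page_url and embed it in a new workbook."""
    # Third-party packages are imported here so the parsing helper above
    # can be used without them. openpyxl needs Pillow to handle images.
    import requests
    from openpyxl import Workbook
    from openpyxl.drawing.image import Image

    response = requests.get(page_url, timeout=30)
    response.raise_for_status()
    image_urls = find_image_urls(response.text, page_url)
    if not image_urls:
        print(f"No image found on {page_url}")
        return None

    # Assumption: the first <img> on the page is the orchid photo.
    image_url = image_urls[0]
    image_path = Path(image_url.rsplit("/", 1)[-1])
    image_response = requests.get(image_url, timeout=30)
    image_response.raise_for_status()
    image_path.write_bytes(image_response.content)

    workbook = Workbook()
    sheet = workbook.active
    sheet["A1"] = name
    sheet["B1"] = page_url
    # Anchor the picture at cell C1; the file must still exist on save.
    sheet.add_image(Image(str(image_path)), "C1")
    workbook.save(xlsx_path)
    return xlsx_path
```

With the dictionary built by `ScrapeOrchids`, this could be called once per entry, for example `save_orchid_to_excel(key, value)` inside the loop in `show_links`. As written, each call creates a fresh workbook, so for several orchids the workbook would have to be created once and a new row used for each one.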