Help extracting text from element (Python Forum, Web Scraping & Web Development, /thread-26366.html)
Help extracting text from element - jpdallas - Apr-29-2020

I've tried many different ways but can't seem to extract the product title and price from the following element:

    <h1 itemprop="name" overrideelementwith="div" class=" _6YOLH _1JtW7 _2VF_A _2OMMP">Classic Fit Solid Wool Suit</h1>
    <span id="current-price-string" class="_1ds4c">$338.00</span>

Thank you in advance for any suggestions.

RE: Help extracting text from element - anbu23 - Apr-29-2020

    >>> from bs4 import BeautifulSoup
    >>> html_string='''<h1 itemprop="name" overrideelementwith="div" class=" _6YOLH _1JtW7 _2VF_A _2OMMP">Classic Fit Solid Wool Suit</h1> <span id="current-price-string" class="_1ds4c">$338.00</span>'''
    >>> soup = BeautifulSoup(html_string, 'html.parser')
    >>> row = soup.find('span')
    >>> row
    <span class="_1ds4c" id="current-price-string">$338.00</span>
    >>> print(row.get_text())
    $338.00
    >>> row = soup.find('h1')
    >>> print(row.get_text())
    Classic Fit Solid Wool Suit

RE: Help extracting text from element - jpdallas - Apr-29-2020

Thank you! I'm still running into trouble, so I thought I'd post my complete script. Essentially what I'm trying to do is run a script where it checks the price of a suit and lets me know when it's dropped below $400.
    import requests
    from bs4 import BeautifulSoup
    import time
    import smtplib

    URL = "https://shop.nordstrom.com/s/peter-millar-classic-fit-solid-wool-suit/4294847/full?origin=category-personalizedsort&breadcrumb=Home%2FMen%2FClothing%2FSuits%20%26%20Separates&fashioncolor=Black&fashionsize=15%3A46r~~42&color=charcoal"
    headers = {"User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:75.0) Gecko/20100101 Firefox/75.0"}
    Wanted_Price = 400

    def trackprice():
        price = int(float(getprice()))
        if price > Wanted_Price:
            diff = price - Wanted_Price
            print(f"It's still ${diff} too expensive")
        # else:
        #     print("Cheaper!")
        if(price < Wanted_Price):
            send_mail()

    def getprice():
        # page = requests.get(URL, headers=headers)
        html_string='''<h1 itemprop="name" overrideelementwith="div" class=" _6YOLH _1JtW7 _2VF_A_2OMMP">Classic Fit Solid Wool Suit</h1> <span id="current-price-string" class="_1ds4c">$338.00</span>'''
        soup = BeautifulSoup(html_string, 'html.parser')
        row = soup.find('span')
        row
        print(row.get_text())
        row = soup.find('h1')
        print(row.get_text())

    def send_mail():
        server = smtplib.SMTP('smtp.gmail.com', 587)
        server.ehlo()
        server.starttls()
        server.ehlo()
        server.login('[email protected]', 'password')
        subject = 'Nordstrom price went down'
        body = 'Check link: https://shop.nordstrom.com/s/peter-millar-classic-fit-solid-wool-suit/4294847/full?origin=category-personalizedsort&breadcrumb=Home%2FMen%2FClothing%2FSuits%20%26%20Separates&fashioncolor=Black&fashionsize=15%3A46r~~42&color=charcoal'
        msg = f"Subject: {subject}\n\n{body}\n\n"
        server.sendmail(
            '[email protected]',
            '[email protected]',
            msg
        )
        print('Email has been sent')
        server.quit()

    if __name__ == "__main__":
        while True:
            trackprice()
            time.sleep(60*60)

RE: Help extracting text from element - anbu23 - Apr-29-2020

Can you add python tags?

RE: Help extracting text from element - jpdallas - Apr-29-2020

Sorry, I'm not sure what you mean. I'm fairly new to this, so I apologize. I just posted my complete code so you could see it.
RE: Help extracting text from element - anbu23 - Apr-29-2020

Check BBCode for more info on tags. Are you able to get the price when you run against the URL? What trouble are you facing?

RE: Help extracting text from element - jpdallas - Apr-29-2020

So when I run my script, here's the output. The other scripts I've created on other sites like Amazon would only come back with the desired info - in this case it would be Classic Fit Solid Wool Suit and $338.00.

    python petermillar3.py
    $338.00
    Classic Fit Solid Wool Suit
    Traceback (most recent call last):
      File "petermillar3.py", line 55, in <module>
        trackprice()
      File "petermillar3.py", line 11, in trackprice
        price = int(float(getprice()))
    TypeError: float() argument must be a string or a number, not 'NoneType'

RE: Help extracting text from element - anbu23 - Apr-30-2020

Return is missing in getprice()

    def getprice():
        ...
        row = soup.find('span')
        return(row.get_text().strip('$'))
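
For anyone hitting the same TypeError: a Python function that only prints and never reaches a return statement evaluates to None, and float(None) is what raises the error in the traceback. The sketch below illustrates this without any network or parsing step. The names getprice_broken, getprice_fixed and price_status are hypothetical and not from the thread, the price_text argument stands in for row.get_text(), and stripping commas is an added assumption for prices such as '$1,299.00'.

```python
def getprice_broken(price_text):
    """Mirrors the posted getprice(): it prints, but never returns."""
    print(price_text)  # falls off the end, so the call evaluates to None


def getprice_fixed(price_text):
    """Return the price as a float so the caller can compare it."""
    # Removing ',' is an assumption, for prices such as '$1,299.00'.
    cleaned = price_text.strip().lstrip('$').replace(',', '')
    return float(cleaned)


def price_status(price, wanted_price=400):
    """Hypothetical helper: describe the price relative to the target."""
    if price < wanted_price:
        return "below target"
    return f"${price - wanted_price:.2f} above target"


if __name__ == "__main__":
    # price_text stands in for row.get_text() from the thread.
    print(getprice_broken("$338.00"))  # prints the text, then None
    print(getprice_fixed("$338.00"))
    print(price_status(getprice_fixed("$338.00")))
```

In the original script the equivalent change is the one anbu23 suggests: end getprice() with a return of the cleaned text. Returning a float directly, as sketched here, would also make the int(float(...)) wrapper in trackprice() unnecessary, though that is a design choice rather than part of the posted fix.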