Any way to remove HTML tags from scraped data? (I want text only)
Python Forum (https://python-forum.io), Web Scraping & Web Development
Any way to remove HTML tags from scraped data? (I want text only) - SeBz2020uk - Nov-02-2020

Hello everyone! I would like to thank you in advance for looking at my thread and trying to resolve my issue.

What I'm trying to achieve is to scrape the current value of gold (per ounce) from a website. My code pulls the data correctly, but it displays the HTML tags in the printed results. I've spent countless hours Googling to try and fix this, but I cannot resolve it. You guys are my only hope haha!

Here's the code I'm using to scrape with:

    # import the required modules
    import requests
    from bs4 import BeautifulSoup

    # request the HTML page to parse
    page = requests.get("https://www.bullionbypost.co.uk/")

    # parse the page and store it in the 'soup' variable
    soup = BeautifulSoup(page.content, 'html.parser')

    # search for the matching tags in the HTML
    results = soup.find_all("span", {"class": "gold-price-per-ounce"})

    # print the results from the search above
    print(results)

This is what the program returns:

Like I mentioned earlier, the desired result would be to print the text string containing the value of gold (without the HTML tags).

Thanks again!

RE: Any way to remove HTML tags from scraped data? (I want text only) - Larz60+ - Nov-02-2020

    # import the required modules
    import requests
    from bs4 import BeautifulSoup

    # request the HTML page to parse
    page = requests.get("https://www.bullionbypost.co.uk/")

    # parse the page and store it in the 'soup' variable
    soup = BeautifulSoup(page.content, 'html.parser')

    # search for the matching tags in the HTML
    results = soup.find_all("span", {"class": "gold-price-per-ounce"})

    # print only the text of the first match, without the surrounding tags
    print(f"Gold price per ounce is: {results[0].text}")
    # print(results)
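For anyone who wants to try the same idea without hitting the live site, here is a minimal sketch. find_all returns a list of tag objects, which is why printing it shows the markup; reading each tag's text (here with get_text(strip=True)) gives the string only. SAMPLE_HTML and extract_texts are made-up names for illustration, and the markup and price in the sample are assumptions, not what the real page contains. The sketch also checks for an empty result, since results[0] raises an IndexError when nothing matches.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page; the real markup and price will differ.
SAMPLE_HTML = """
<div class="prices">
  <span class="gold-price-per-ounce"> 1234.56 </span>
  <span class="silver-price-per-ounce">15.00</span>
</div>
"""


def extract_texts(html, tag, css_class):
    """Return the text of every matching tag, with markup and outer whitespace removed."""
    soup = BeautifulSoup(html, "html.parser")
    matches = soup.find_all(tag, class_=css_class)
    # get_text(strip=True) drops the tags and trims surrounding whitespace
    return [match.get_text(strip=True) for match in matches]


gold_prices = extract_texts(SAMPLE_HTML, "span", "gold-price-per-ounce")

if gold_prices:
    print(f"Gold price per ounce is: {gold_prices[0]}")
else:
    # An empty list means the class name was not found in the HTML that was parsed
    print("No matching element found")
```

With the real page, page.content from requests.get would take the place of SAMPLE_HTML. If the list comes back empty there, the price may be filled in by JavaScript after the page loads, in which case it is not in the HTML that requests receives.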