Python Forum

Web Scraper with BeautifulSoup4 sometimes no Output
Hello,

I need some help with my Python script.
I want to build a web scraper that collects some prices from this website:
https://www.medizinfuchs.de/?params%5Bse...h_cat%5D=1
I wrote the following code:
from bs4 import BeautifulSoup
import requests

URL = "https://www.medizinfuchs.de/?params%5Bsearch%5D=10714367&params%5Bsearch_cat%5D=1"

my_headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml,application/xml; q=0.9,image/webp,image/apng,*/*;q=0.8",
}
  
page = requests.get(URL, headers=my_headers)

soup = BeautifulSoup(page.text, "lxml")

# Strip whitespace and the euro sign, then turn the decimal comma into a point.
for price in soup.select("ul.Apothekenliste div.price"):
    print(float(price.text.strip(' \t\n€').replace(',', '.')))
It works sometimes, but far too inconsistently.
I really don't know what I should be doing differently.

Thanks for your help!
Well, for me it does not seem to work at all.
It works sometimes but is not stable; written like this it is a little more stable.
It may be more stable using Selenium in headless mode,
but the code below should work OK.
from bs4 import BeautifulSoup
import requests
import time

URL = "https://www.medizinfuchs.de/?params%5Bsearch%5D=10714367&params%5Bsearch_cat%5D=1"
my_headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml,application/xml; q=0.9,image/webp,image/apng,*/*;q=0.8",
}
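page = requests.get(URL, headers=my_headers)
soup = BeautifulSoup(page.text, "lxml")
# Note: this pause runs after the page has already been downloaded and parsed,
# so it only delays the output; it does not change what was fetched.
time.sleep(5)
# Find the <div> elements whose class attribute is "col-xs-24 single" and print their text.
price_lst = soup.find_all('div', class_="col-xs-24 single")
for price in price_lst:
    print(price.text.strip())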
 
Output:
6,81 € 6,84 € 7,14 € 7,23 € 7,36 € 7,39 € 7,53 € 7,60 €
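If numbers are needed rather than text, the output line above could be converted with a small helper. This is only a sketch: parse_prices is a hypothetical name that does not appear in the thread, and it assumes every price looks like "6,81 €" (decimal comma, two decimals, no thousands separator).

```python
import re

# Hypothetical helper, not part of the original thread.
# Assumes prices are formatted like "6,81 €": digits, a comma, two decimals.
PRICE_PATTERN = re.compile(r"(\d+),(\d{2})\s*€")


def parse_prices(text):
    """Return all prices found in text as floats, in order of appearance."""
    return [float(f"{euros}.{cents}") for euros, cents in PRICE_PATTERN.findall(text)]


sample = "6,81 € 6,84 € 7,14 € 7,23 € 7,36 € 7,39 € 7,53 € 7,60 €"
prices = parse_prices(sample)
print(prices)       # all prices as floats
print(min(prices))  # cheapest offer in the sample
```

With floats in hand, things like min(prices) or sorting work directly. If the site ever prints prices of 1000 € or more with a thousands separator, the pattern would need adjusting.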
I think if you want a continuous checker over time, you should use something like schedule, or set up the scheduling at the OS level.
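A continuous checker along those lines might look roughly like the sketch below. It uses a plain standard-library loop as a stand-in for the schedule package, so nothing extra needs installing. Both run_periodically and check_prices are hypothetical names, and check_prices is only a placeholder for the scraping code above.

```python
import time


def run_periodically(job, interval_seconds, repeats, sleep=time.sleep):
    """Call job() `repeats` times, pausing interval_seconds between calls.

    The sleep function is a parameter so it can be swapped out in tests.
    Returns the list of values returned by job().
    """
    results = []
    for attempt in range(repeats):
        results.append(job())
        # No pause is needed after the final run.
        if attempt < repeats - 1:
            sleep(interval_seconds)
    return results


def check_prices():
    # Placeholder (assumption): the real version would fetch the page
    # and parse the prices; here it just returns a timestamp.
    return time.strftime("%H:%M:%S")


# The interval is 0 so this demo finishes at once;
# something like 600 would mean one check every ten minutes.
for stamp in run_periodically(check_prices, interval_seconds=0, repeats=3):
    print("checked at", stamp)
```

With the schedule package the same idea would be written as schedule.every(10).minutes.do(check_prices), followed by a loop that calls schedule.run_pending(). For long-running checks an OS-level scheduler avoids keeping a Python process alive.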