Help to web scrape from 2 different sources

Help to web scrape from 2 different sources - Extra - Jan-05-2023

Hello,

I have an Amazon Price Tracker program that reads Amazon links from my database, prints each product's current price, and prints a message saying whether I should buy it, based on the budget price I set in the database (the database stores: ItemName, ItemLink, AlertPrice).

The problem is that it only works with Amazon links, not with links from other websites such as Walmart.ca.

How would I get my code to work with other sites like Walmart?

Thanks in advance.

import requests
from bs4 import BeautifulSoup
import sqlite3
from rich import print

#Currency symbols (and the thousands separator) to strip from the price string.
#"HK$" is listed before "$" so the full symbol is removed before the bare "$".
currency_symbols = ['€', '£', 'HK$', '$', '¥', '₹', ',']

headers = {
'authority': 'www.amazon.com',
'pragma': 'no-cache',
'cache-control': 'no-cache',
'dnt': '1',
'upgrade-insecure-requests': '1',
'user-agent': 'Mozilla/5.0 (X11; CrOS x86_64 8172.45.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.64 Safari/537.36',
'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
'sec-fetch-site': 'none',
'sec-fetch-mode': 'navigate',
'sec-fetch-dest': 'document',
'accept-language': 'en-GB,en-US;q=0.9,en;q=0.8',
}

#------------------------------------------
#            Get Price of Products
#------------------------------------------
#Get the price of each product
def get_price(URL):
    response = requests.get(URL, headers=headers)
    soup = BeautifulSoup(response.content, "html.parser")

    #Finding the elements
    product_title = soup.find('span', class_ = "a-size-large product-title-word-break").getText()
    product_price = soup.find('span', class_ = "a-offscreen").getText()

    # Use replace() to remove currency symbols and thousands separators
    for symbol in currency_symbols:
        product_price = product_price.replace(symbol, '')

    ProductTitleStrip = product_title.strip()
    ProductPriceStrip = product_price.strip()
    print("[bright_yellow]"+ProductTitleStrip)
    print("[bright_cyan]$" + ProductPriceStrip)

    #Converting the string to a float (int() would drop the cents before the comparison)
    product_price = float(product_price)
    return product_price
#------------------------------------------


#------------------------------------------
#            Get Products to Track
#------------------------------------------
#Connect to the database
connection = sqlite3.connect('ProductTrackerDatabase.db')
cursor = connection.cursor()

for Product_Name, URL, my_price in cursor.execute("SELECT Product, URL, Alert_Price FROM AmazonPriceTracker"):
    current_price = get_price(URL)
    if current_price < float(my_price):
        print("[green]You Can Buy This Now!\n")
    else:
        print("[red]The Price Is Too High\n")

connection.close() #Close the connection
#------------------------------------------

This prints 5 products from Amazon. The last item in the database, the one that raises the error, is a Walmart link.

Output:
MSI Gaming Geforce GTX 1660 Super 192-bit HDMI/DP 6GB GDRR6 HDCP Support DirectX 12 Dual Fan VR Ready OC Graphics Card
$401.81
The Price Is Too High

Western Digital 2TB WD Blue 3D NAND Internal PC SSD - SATA III 6 Gb/s, 2.5"/7mm, Up to 560 MB/s - WDS200T2B0A
$224.99
The Price Is Too High

12V 3000mAh Monitors Large Capacity Rechargeable Li-ion Storage Battery
$23.26
You Can Buy This Now!

ZOTAC Gaming GeForce GTX 1660 6GB GDDR5 192-bit Gaming Graphics Card, Super Compact, ZT-T16600K-10M
$355.00
The Price Is Too High

Hobart 770726 Shade 5, Mirrored Lens Safety Glasses
$45.51
The Price Is Too High

Traceback (most recent call last):
  File "C:\Users\BX-PC\Downloads\Python Programs\Amazon Price Tracker\AmazonPriceTracker.py", line 58, in <module>
    current_price = get_price(URL)
  File "C:\Users\BX-PC\Downloads\Python Programs\Amazon Price Tracker\AmazonPriceTracker.py", line 32, in get_price
    product_title = soup.find('span', class_ = "a-size-large product-title-word-break").getText()
AttributeError: 'NoneType' object has no attribute 'getText'
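
The traceback comes from soup.find() returning None: the class "a-size-large product-title-word-break" is Amazon-specific, so nothing matches on the Walmart page and .getText() is called on None. One possible approach, shown here only as a rough sketch, is to pick the selectors from the URL's hostname and to check for None before reading the text. The names SITE_SELECTORS, selectors_for, find_text, parse_price and extract_product are made up for this illustration, and the Walmart selectors are placeholders, not real class names; they would have to be found by inspecting the actual page. parse_price also assumes prices use a dot for decimals and a comma for thousands.

```python
import re
from urllib.parse import urlparse

# Selectors per site, keyed by a label found in the hostname.
# The Amazon entries are copied from the script above. The Walmart entries are
# HYPOTHETICAL placeholders: inspect the real page and replace them.
SITE_SELECTORS = {
    "amazon": {
        "title": ("span", "a-size-large product-title-word-break"),
        "price": ("span", "a-offscreen"),
    },
    "walmart": {
        "title": ("h1", "PLACEHOLDER-title-class"),
        "price": ("span", "PLACEHOLDER-price-class"),
    },
}


def selectors_for(url):
    """Return the selector table for the site that hosts `url`."""
    host = urlparse(url).hostname or ""
    for site, selectors in SITE_SELECTORS.items():
        if site in host.split("."):
            return selectors
    raise ValueError(f"No selectors configured for {host!r}")


def find_text(soup, tag, css_class):
    """Return the stripped text of the first match, or None if nothing matches."""
    element = soup.find(tag, class_=css_class)
    return element.get_text(strip=True) if element is not None else None


def parse_price(text):
    """Extract the first number from a price string such as '$1,234.56'."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"No price found in {text!r}")
    return float(match.group().replace(",", ""))


def extract_product(soup, url):
    """Return (title, price) for a parsed page, or None if an element is missing."""
    selectors = selectors_for(url)
    title = find_text(soup, *selectors["title"])
    price_text = find_text(soup, *selectors["price"])
    if title is None or price_text is None:
        # Wrong selectors, a blocked request, or content rendered by JavaScript.
        return None
    return title, parse_price(price_text)
```

Inside get_price(), the soup built with BeautifulSoup(response.content, "html.parser") could be passed to extract_product(soup, URL), and the main loop could skip or report any product for which it returns None instead of crashing. Note that the 'authority': 'www.amazon.com' header is Amazon-specific, and that some stores may render prices with JavaScript or reject automated requests, in which case no selector will match the HTML that requests receives.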