Python Forum

How can I ignore empty fields when scraping
I'm gathering card data for a game I like. Some cards have certain text fields and others don't. The script searches for each field, and when it reaches a card with missing data it breaks and doesn't continue.

I need to skip a few sections when they are absent; not all cards have weakness, flavor text, evolution info, etc.

There are about 230 cards to scrape. The code below stops after 6, because the 7th card on the page doesn't have "flavor" text. If I comment out "flavor", it scrapes about 126 cards, because the card it stops on doesn't have "att" or "weak", etc.

So I need to tell the script: if something is missing, just ignore it and move on. But I don't know how to do this.

Here is my code:
from bs4 import BeautifulSoup
import requests, openpyxl


excel = openpyxl.Workbook()
print(excel.sheetnames)
sheet = excel.active
sheet.title = "Pokemon Cards"
print(excel.sheetnames)
sheet.append(['title', 'slug', 'sku', 'category_id', 'price', 'discount_rate', 'vat_rate', 'stock', 'description', 'image_url', 'external_link'])


try:
    source = requests.get('https://pkmncards.com/set/chilling-reign/?sort=date&ord=auto&display=full')
    source.raise_for_status()

    soup = BeautifulSoup(source.text, 'html.parser')

    cards = soup.find_all(class_="entry")

    for card in cards:
        #title = card.find(class_="card-title")
        title = card.find('h2').span.text
        details = card.find(class_='card-tabs').text
        image_url = card.find(class_='card-image-area').a
        price = card.find(class_='m').span.text
        name = card.find(class_='name-hp-color').text
        att = card.find(class_="tab").find(class_="text").text
        evol = card.find(class_='type-evolves-is').text
        weak = card.find(class_='weak-resist-retreat')
        ill = card.find(class_='illus minor-text').text
        release = card.find(class_='release-meta minor-text').text
        stan = card.find(class_='mark-formats minor-text').text
        flavor = card.find(class_='flavor minor-text').text
        slug = ""
        sku = ""
        category_id = "50"
        discount_rate = ""
        vat_rate = ""
        stock = "4"
        external_link = ""
        #description1 = "<b>Card Name</b> " + name + " <br> " + evol + " <br> " + att + " <br> " + weak + " <br> " + ill + " <br> " + release + " <br> "+ stan + " <br> " + " <br><br> " + "All Prices are subject to change please message me for more details."
        #description = description1.replace("Pokémon", "Pokemon").replace("×", "x").replace(" → ", " > ").replace("⇢", ">").replace("↘", ">").replace(" · ", " - ").replace(" › ", " > ").replace("’", "'")

        print(title)

        #print(title, description, image_url.get('href'), price)
        #sheet.append([title, slug, sku, category_id, price, discount_rate, vat_rate, stock, description, image_url.get('href'), external_link])


except Exception as e:
    print(e)


#excel.save('chill_all4.xlsx')
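
For reference, here is a minimal sketch of one way to skip missing fields. It is not a tested fix for the real page. When `card.find(...)` finds nothing it returns `None`, so `.text` raises an `AttributeError`. Because the `try` in the code above wraps the whole loop, that first error ends the loop. The helper name `safe_text` and the sample HTML are made up for illustration; only the `entry` and `flavor` class names come from the code above.

```python
from bs4 import BeautifulSoup


def safe_text(parent, *args, default="", **kwargs):
    """Return the stripped text of the first match, or `default` if nothing matches.

    Hypothetical helper: the extra arguments are passed straight to parent.find().
    """
    tag = parent.find(*args, **kwargs)
    return tag.get_text(strip=True) if tag is not None else default


# Made-up sample markup (an assumption, not the real page): Card B has no flavor text.
SAMPLE_HTML = """
<div class="entry"><h2><span>Card A</span></h2><div class="flavor minor-text">It glows.</div></div>
<div class="entry"><h2><span>Card B</span></h2></div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

rows = []
for card in soup.find_all(class_="entry"):
    title = safe_text(card, "h2")
    # A missing field becomes an empty string instead of raising AttributeError.
    flavor = safe_text(card, class_="flavor")
    rows.append((title, flavor))

print(rows)
```

In the script above, the same idea would mean replacing each `card.find(...).text` with a call like `safe_text(card, class_='flavor')`, so a card without that element yields an empty string and the loop carries on to the next card.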