Sep-26-2021, 03:11 AM
Hi Guys,
After trying to figure this one out for over 8 hours, I thought I would get a fresh perspective from someone.
I'm practicing some web scraping and have what should be a simple goal: find an object and, if it exists, extract some data from it (shipping information); if it doesn't exist, enter a placeholder such as " ". I'm going to load the results into pandas, so I need to record something when the object can't be found; otherwise I know I'll get the "ValueError: arrays must all be same length" error.
I've tried many things to do this, but I'm unable to successfully:
1) capture where the object doesn't exist; and
2) accurately get data from when the object does exist.
My current iteration of the code is:
```python
from bs4 import BeautifulSoup

with open("out_of_stock2.html", encoding="utf8") as fp:
    soup = BeautifulSoup(fp, 'html.parser')

for item in soup:
    mt2 = soup.find('span', {'class': 'w_A w_C w_B mr1 mt1 ph1'})
    if mt2 is None:
        print('There is no record')
    else:
        print(mt2)
```
When I run this, I get:
Output:
```
<span class="w_A w_C w_B mr1 mt1 ph1">1-day shipping</span>
<span class="w_A w_C w_B mr1 mt1 ph1">1-day shipping</span>
<span class="w_A w_C w_B mr1 mt1 ph1">1-day shipping</span>
```
I'm not sure why I'm getting 3 instances of this when the data only contains 1. (The object I'm looking for is the span with class "w_A w_C w_B mr1 mt1 ph1".)
Additionally, there is one record in the dataset that doesn't contain the object, but the output never shows my 'There is no record' print statement.
Could someone please shed some light on what I'm doing incorrectly?
Thank you.
Attached Files
out_of_stock2.html (Size: 7.61 KB / Downloads: 251)
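One possible explanation for both symptoms, offered as a sketch rather than a definitive answer: `for item in soup` walks the top-level nodes of the parsed document, while `soup.find(...)` searches the whole document on every pass. It therefore returns the same first match once per top-level node, and never returns None as long as one matching span exists anywhere. Searching inside each record instead would look roughly like the code below. The markup of out_of_stock2.html isn't shown here, so the sample HTML, the `product` container class, and the `shipping_column` helper are all hypothetical stand-ins.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real structure of out_of_stock2.html is not shown,
# so the "product" container class is an assumption for illustration only.
SAMPLE_HTML = """
<div class="product"><span class="w_A w_C w_B mr1 mt1 ph1">1-day shipping</span></div>
<div class="product"><p>Out of stock</p></div>
"""

SHIPPING_CLASS = "w_A w_C w_B mr1 mt1 ph1"


def shipping_column(html):
    """Return one shipping value per record, using " " where the span is missing."""
    soup = BeautifulSoup(html, "html.parser")
    values = []
    # Loop over the per-record containers, not over the soup object itself.
    for item in soup.find_all("div", class_="product"):
        # Search within this record only (item.find, not soup.find).
        tag = item.find("span", {"class": SHIPPING_CLASS})
        values.append(tag.get_text(strip=True) if tag is not None else " ")
    return values


shipping = shipping_column(SAMPLE_HTML)
print(shipping)
```

Because every record contributes exactly one entry, the resulting list stays the same length as any other column built the same way, which is what pandas needs.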