Sep-26-2021, 11:05 AM
Hi snippsat,
Thank you for your advice about not using soup when looping. I had tried over 30 different methods to get this data and most of them didn't use soup when looping, but after all those failures I tried soup, and of course that didn't work either! Good to know never to use it for looping.
Also, thank you for teaching me the easier way to call a class. That's soooo much easier than what I've always done. I have seen your method before, but as I'm still learning, I didn't want to try and learn too many variations and confuse myself more.
With regard to the parser, I've only ever used one: html.parser.
So I followed your suggestion to use lxml and tried the following code:
from bs4 import BeautifulSoup

with open(r"out_of_stock2.html", encoding="utf8") as fp:
    soup = BeautifulSoup(fp, 'lxml')

ph1 = soup.find_all('div', class_='h-100 pb1-xl pr4-xl pv1 ph1')
for item in ph1:
    mt1_ph1 = item.find('span', class_='w_A w_C w_B mr1 mt1 ph1')
    if mt1_ph1 is None:
        print('No data')
    else:
        print(mt1_ph1.text)

The result it returned:
Output:
No data
1-day shipping
You fixed it! Thank you so much. I've spent 2 days trying to figure it out, and honestly probably wouldn't have thought of trying your option. Really appreciate it.

(Sep-26-2021, 06:36 AM)snippsat Wrote:
Should not loop over the soup object knight2000, as it's not needed and can give unwanted results.
It will depend on the parser used, so if I use lxml (recommended) as the parser the length will be one.
from bs4 import BeautifulSoup

with open(r"out_of_stock2.html", encoding="utf8") as fp:
    soup = BeautifulSoup(fp, 'lxml')

print(len(soup))
mt2 = soup.find('span', class_="w_A w_C w_B mr1 mt1 ph1")
if mt2 is None:
    print('There is no record')
else:
    print(mt2)

Output:
1
<span class="w_A w_C w_B mr1 mt1 ph1">1-day shipping</span>

It's easier to use class_="w_A w_C w_B mr1 mt1 ph1" than to make it a dictionary call.
Then you can just copy the CSS class from the website and add one _.
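To see the class_ shortcut next to the dictionary call, here is a small sketch. The HTML string is a made-up stand-in for out_of_stock2.html (the real file isn't shown in the thread), and it uses html.parser only so it runs without installing lxml. Both calls should find the same tag, and the loop goes over the find_all() results rather than over soup itself.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for out_of_stock2.html (not the real page)
html = """
<div class="h-100 pb1-xl pr4-xl pv1 ph1">
  <span class="w_A w_C w_B mr1 mt1 ph1">1-day shipping</span>
</div>
<div class="h-100 pb1-xl pr4-xl pv1 ph1"></div>
"""

# html.parser is used here only to avoid the extra lxml install
soup = BeautifulSoup(html, "html.parser")

css = "w_A w_C w_B mr1 mt1 ph1"  # class string copied as-is from the page

# Keyword form and dictionary form of the same search
by_keyword = soup.find("span", class_=css)
by_dict = soup.find("span", attrs={"class": css})
same = by_keyword is by_dict  # both return the same tag from the tree
print(same)

# Loop over the matched divs, never over soup itself
results = []
for div in soup.find_all("div", class_="h-100 pb1-xl pr4-xl pv1 ph1"):
    span = div.find("span", class_=css)
    results.append(span.text if span is not None else "No data")
print(results)
```

Note that passing the whole multi-class string matches the class attribute exactly as written, so it would stop matching if the site changed the order of the classes.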