Python Forum
Unable to fetch product url using BeautifulSoup with Python3.6
Hi Expert,

I am fetching each page and parsing its HTML with the code below:
import requests
from bs4 import BeautifulSoup

def get_soup(url):
    response = requests.get(url)
    html = response.content
    return BeautifulSoup(html, "html.parser")
I then fetch the category URLs with:
def get_category_urls(url):
    soup = get_soup(url)
    cat_urls = []
    try:
        categories = soup.find('div', attrs={'id': 'menu_oc'})
        if categories is not None:
            # href=True skips anchors without an href instead of raising KeyError
            for c in categories.findAll('a', href=True):
                cat_urls.append(c['href'])
    except Exception as exc:
        print("error::" + url + str(exc))
    finally:
        return cat_urls
Now I am trying to fetch the product URLs with the code below:
def get_product_urls(url):
    soup = get_soup(url)
    prod_urls = []
    try:
        if soup.find('div', attrs={'class': 'pagination'}):
            pages = soup.find('div', attrs={'class': 'page'}).text.split("of ", 1)[1].replace(' (1 Pages)','')
            if pages is not None:
                for page in range(1, int(pages) + 1):
                    soup_with_page = get_soup(url + "&page={}".format(page))
                    product_urls_soup = soup_with_page.find('div', attrs={'id': 'carousel-featured-0'})
                    if product_urls_soup is not None:
                        # href=True skips anchors without an href instead of raising KeyError
                        for row in product_urls_soup.findAll('a', href=True):
                            prod_urls.append(row['href'])
    except Exception as exc:
        print("error:: " + url + ": " + str(exc))  # log the URL; prod_urls is a list and cannot be concatenated to a str
    finally:
        return prod_urls
from multiprocessing import Pool

if __name__ == '__main__':
    # category_urls is the list returned by get_category_urls()
    with Pool(2) as p:
        product_urls = p.map(get_product_urls, category_urls)
    product_urls = list(filter(None, product_urls))
    product_urls_flat = list(set([y for x in product_urls for y in x]))
product_urls_soup comes back as None here. What am I doing wrong? Please find sample HTML data below:

html data

How should I handle pagination here, since some categories have pagination and some do not?

I finally found the issue.
I was not checking for pagination in every category, which is what caused the problem.
I have now solved it by adding a check for pagination.
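
For anyone landing here with the same problem, below is a minimal sketch of one way such a pagination check could look. It is not the exact fix used above (that code was not posted). The helper names count_pages and extract_product_urls, the fetch parameter, and the "(N Pages)" text format are assumptions for illustration; the div classes, the carousel id, and the "&page=" suffix are taken from the code in the question.

```python
import re
from bs4 import BeautifulSoup


def count_pages(soup):
    """Return the number of result pages, or 1 when the page has no pagination."""
    if soup.find('div', attrs={'class': 'pagination'}) is None:
        return 1
    page_div = soup.find('div', attrs={'class': 'page'})
    text = page_div.get_text() if page_div is not None else ''
    # Assumption: the text ends with something like "(3 Pages)".
    match = re.search(r'\((\d+)\s+Pages?\)', text)
    return int(match.group(1)) if match else 1


def extract_product_urls(soup):
    """Collect hrefs from the product container; empty list if it is missing."""
    container = soup.find('div', attrs={'id': 'carousel-featured-0'})
    if container is None:
        return []
    # href=True skips anchors that have no href attribute.
    return [a['href'] for a in container.find_all('a', href=True)]


def get_product_urls(url, fetch):
    """Gather product URLs for a category, paginated or not.

    fetch is any callable that takes a URL and returns a BeautifulSoup
    object, for example the get_soup function from the question.
    """
    first_page = fetch(url)
    urls = extract_product_urls(first_page)
    # Page 1 is already handled, so unpaginated categories skip this loop.
    for page in range(2, count_pages(first_page) + 1):
        urls.extend(extract_product_urls(fetch("{}&page={}".format(url, page))))
    return urls
```

With the get_soup function from the first post, this would be called as get_product_urls(category_url, get_soup). Passing the fetcher in as an argument also makes it easy to try the logic on saved HTML without hitting the site.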


Messages In This Thread
Unable to fetch product url using BeautifulSoup with Python3.6 - by PrateekG - Jun-05-2018, 06:29 AM
