Python BeautifulSoup IndexError: list index out of range

rhat398 · May-28-2021, 11:30 AM

import requests
from bs4 import BeautifulSoup
import csv
import time

class LightupScraper:

    results = []

    def fetch(self, url):
        print(f'HTTP GET request to URL: {url}', end='')
        res = requests.get(url)
        print(f' | Status Code: {res.status_code}')

        return res

    def save_response(self, res):
        with open('res.html', 'w') as html_file:
            html_file.write(res)

    def load_response(self):
        html = ''

        with open('res.html', 'r') as html_file:
            for line in html_file:
                html += line

            return html

    def parse(self, html):

        content = BeautifulSoup(html, 'lxml')
        titles = [title.text.strip() for title in content.find_all('h4', {'class': 'card-title ols-card-title'})]
        links = [link.find('a')['href'] for link in content.find_all('h4', {'class': 'card-title ols-card-title'})]
        skus = [sku.text for sku in content.find_all('span', {'class': 'productView-info-value ols-card-text--sku'})]
        mpn = [mpn.text.split(':')[-1].strip() for mpn in content.find_all('span', {'class': 'productView-info-name mpn-label ols-card-text--mpn'})]
        details = [ul.find_all('li') for ul in content.find_all('ul', {'class': 'ols-card-text__list'})]
        brand = [''.join([brand.text for brand in detail if 'Brand:' in brand.text]).split(':')[-1].strip() for detail in details]
        base = [''.join([base.text for base in detail if 'Base Type:' in base.text]).split(':')[-1].strip() for detail in details]
        life_hours = [''.join([life_hour.text for life_hour in detail if 'Life Hours:' in life_hour.text]).split(':')[-1].strip() for detail in details]
        lumens = [''.join([lumen.text for lumen in detail if 'Lumens:' in lumen.text]).split(':')[-1].strip() for detail in details]
        warrantys = [''.join([warranty.text for warranty in detail if 'Warranty:' in warranty.text]).split(':')[-1].strip() for detail in details]
        wattages = [''.join([wattage.text for wattage in detail if 'Wattage:' in wattage.text]).split(':')[-1].strip() for detail in details]
        features = [feature.text.split() for feature in content.find_all('span', {'class': 'ols-card-text__list--features'})]
        prices = [price.text for price in content.find_all('span', {'class': 'price price--withoutTax'})]
        #print(prices)



        for feature in features:
            feat = feature

        for item in range(0, len(titles)):
            self.results.append({
                'titles': titles[item],
                'skus': skus[item],
                'mpn': mpn[item],
                'brand': brand[item],
                'base': base[item],
                'life_hours': life_hours[item],
                'lumens': lumens[item],
                'warrantys': warrantys[item],
                'wattages': wattages[item],
                'feature': feat[item],
                'links': links[item],
                'price': prices[item]
            })

    def to_csv(self):
        with open('lightup.csv', 'w', newline='') as csv_file:
            writer = csv.DictWriter(csv_file, fieldnames=self.results[0].keys())
            writer.writeheader()

            for row in self.results:
                writer.writerow(row)

            print('Exported results to lightup.csv')

    def run(self):
        
        page_num = 3

        for page in range(1, page_num + 1):
            base_url = 'https://www.lightup.com/standard-household-lighting.html?p='
            base_url += str(page)
            res = self.fetch(base_url)
            self.parse(res.text)
            #time.sleep(30)

        self.to_csv()
        # html = self.load_response()
        # self.parse(html)
        #self.save_response(html.text)


if __name__ == '__main__':
    scraper = LightupScraper()
    scraper.run()

Error:

  File "lightup_scraper.py", line 66, in parse
    'price': prices[item]
IndexError: list index out of range

I tried to scrape the prices, but I am getting a list index out of range error because the tag responsible for the price returns 14 elements while the other tags return 16. This is because some price tags have different classes: the per-case price tag has the class "price price--withoutTax price-per--case", while the single-product price tag has "price price--withoutTax". I tried a try/except block, but no luck: it gives me a whole other list, not the individual prices. I can't get my head around this problem. Maybe someone can give me some pointers to actually make this work.

Daring_T · (This post was last modified: May-28-2021, 09:10 PM by Daring_T.)

There are several ways to fix this. Here are a couple that I thought of.

Here's a one-liner for line 66:

'price': prices[item] if item < len(prices) else None

Or, if you want a function:

def get(seq, index, default=None):
    # Return seq[index], or default when the index is out of range.
    try:
        return seq[index]
    except IndexError:
        return default
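
One caveat with both fixes above: they stop the crash, but the missing prices get padded at the end of the list, so if the skipped price tags sit in the middle of the page, later prices end up attached to the wrong products. As far as I know, passing a multi-class string to find_all matches the class attribute as an exact string, which would explain why the per-case tags are skipped. Below is a rough sketch of another approach: loop over each product card and look up the fields inside that card, so every row stays aligned. The sample HTML and the 'li.product' container selector are assumptions made up for illustration, and text_or_default and parse_cards are hypothetical helper names; the real page's markup needs to be checked.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real card wrapper may differ.
SAMPLE_HTML = """
<ul>
  <li class="product">
    <h4 class="card-title ols-card-title"><a href="/bulb-a">Bulb A</a></h4>
    <span class="price price--withoutTax">$4.99</span>
  </li>
  <li class="product">
    <h4 class="card-title ols-card-title"><a href="/bulb-b">Bulb B</a></h4>
    <span class="price price--withoutTax price-per--case">$39.99</span>
  </li>
  <li class="product">
    <h4 class="card-title ols-card-title"><a href="/bulb-c">Bulb C</a></h4>
  </li>
</ul>
"""


def text_or_default(tag, default=None):
    """Return the stripped text of a tag, or default when the tag is missing."""
    return tag.get_text(strip=True) if tag is not None else default


def parse_cards(html):
    """Build one dict per product card so fields cannot drift out of step."""
    soup = BeautifulSoup(html, 'html.parser')
    rows = []
    # 'li.product' is an assumed selector for a single product card.
    for card in soup.select('li.product'):
        title = card.select_one('h4.card-title')
        link = title.find('a') if title is not None else None
        # A CSS class selector matches any span carrying both classes,
        # so the per-case variant with the extra class is matched as well.
        price = card.select_one('span.price.price--withoutTax')
        rows.append({
            'title': text_or_default(title),
            'link': link['href'] if link is not None else None,
            'price': text_or_default(price),
        })
    return rows


rows = parse_cards(SAMPLE_HTML)
for row in rows:
    print(row)
```

A card without a price simply gets None in its own row instead of shifting every later price up by one. The other fields (SKU, MPN, brand and so on) could be looked up inside the same loop in the same way.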
