Custom method to handle exceptions not working as expected

gradlon93 · Dec-22-2022, 05:32 PM

Good afternoon,

I am working on a web scraping project using BeautifulSoup and, on occasion, Selenium webdriver.

The method .findAll(...) by BeautifulSoup always returns a list, but its content is not guaranteed and the list might also be empty.
I thought I would write a custom method to handle IndexError, so that I don't need a try/except every time I scrape something.

        def get_element(element, idx_error=None) -> Any:
            # TODO: This method does not work, idx error is not handled. FIX IT AS SOON AS POSSIBLE!!!
            value_ = None
            try:
                value_ = element

            except IndexError:
                if inspect.isfunction(idx_error):
                    idx_error()
                else:
                    return idx_error

Basically, I want to return value_ ONLY if it exists; if the operation raises an IndexError exception, then I want the idx_error value to be returned, or run if it is a function/method.
This data will then be used in a GUI, so I would rather get a placeholder value (e.g. 'Untitled') than a blocking error.

Example usage:

title_selector = content.findAll('h1', attrs={"itemprop": "title"})  # Returns a list of page elements
title = get_element(title_selector[0].text, 'Untitled') # Returns first element in title_selector if there is one; if IndexError returns 'Untitled'

Which should be the equivalent of this:

title_selector = content.findAll('h1', attrs={"itemprop": "title"})
title = None
try:
    title = title_selector[0].text
except IndexError:
    title = 'Untitled'

Unfortunately, for some reason I can't figure out, it just doesn't do anything, and the IndexError is thrown anyways.

Any ideas?

**Yoriz** · Dec-22-2022, 06:13 PM

This won't work because in the line

title = get_element(title_selector[0].text, 'Untitled') # Returns first element in title_selector if there is one; if IndexError returns 'Untitled'

the IndexError is raised before the function is ever called: Python evaluates title_selector[0].text first, to build the argument, and indexing the empty list fails right there.
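
To make the evaluation order concrete, here is a minimal sketch. The get_element below is only a stand-in with the same shape as the function in the question, and the empty list stands in for an empty findAll(...) result; both are assumptions for illustration.

```python
def get_element(element, idx_error=None):
    # Stand-in for the function in the question: by the time this body runs,
    # `element` is already a fully evaluated value, so nothing here can
    # raise IndexError.
    return element


title_selector = []  # stands in for an empty findAll(...) result

try:
    # Python evaluates title_selector[0] first, while building the arguments.
    title = get_element(title_selector[0], 'Untitled')
except IndexError:
    # The exception surfaces at the call site; get_element was never entered.
    title = 'Untitled'

print(title)
```

The try/except inside the function can only catch errors raised by code that runs inside it, which is why the indexing has to happen either inside the function or behind a check at the call site.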

You could check if the list is empty

if not title_selector:
    title = 'Untitled'
else:
    title = whatever_you_want_to_do_instead

gradlon93 · (This post was last modified: Dec-22-2022, 06:57 PM by gradlon93.)

If I got you right, Python throws the IndexError at the exact moment it evaluates the argument passed to the function, so execution never even gets inside the function. That makes sense, thank you for your insight.

I changed my method as follows, and now it works.

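import inspect
import warnings

def get_element(element: list, idx: int, idx_error):
    try:
        # The indexing now happens inside the try block, so IndexError is caught here.
        value = element[idx]
    except IndexError:
        warnings.warn('Element not found on page.')
        if inspect.isfunction(idx_error):
            # Run the fallback function and return its result.
            return idx_error()
        # Otherwise treat idx_error as a plain default value.
        return idx_error
    else:
        return value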
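
For anyone who wants to keep the original call shape, where the whole expression is passed in, one possible alternative is to wrap the expression in a lambda so that it is only evaluated inside the helper's try block. This is a sketch, not something proposed in the thread: get_or_default is a hypothetical name, and the two lists stand in for findAll(...) results.

```python
def get_or_default(getter, default=None):
    """Call getter() and return its result, or default if it raises IndexError."""
    try:
        return getter()
    except IndexError:
        return default


empty = []            # stands in for an empty findAll(...) result
found = ['My title']  # stands in for a non-empty result

# The lambda delays the indexing until get_or_default calls it inside its try.
title_missing = get_or_default(lambda: empty[0], 'Untitled')
title_found = get_or_default(lambda: found[0], 'Untitled')

print(title_missing, title_found)
```

With BeautifulSoup the call would look like get_or_default(lambda: title_selector[0].text, 'Untitled'), which also covers attribute access on the element, since the whole expression runs inside the try block.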

**deanhystad** · (This post was last modified: Dec-22-2022, 07:12 PM by deanhystad.)

I don't understand why you would use indexing with find_all(). I would expect code to iterate over the find_all() results, and an index error would not be possible.

Can you post an example of code where indexing is required? In your posted example I don't see why you didn't use find() instead of find_all().
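
As a rough sketch of the find() route: find() returns None when nothing matches, so a None check can take the place of the try/except. The helper name text_or_default is hypothetical, and SimpleNamespace is used here only as a stand-in for a tag object so the example runs without bs4.

```python
from types import SimpleNamespace


def text_or_default(tag, default='Untitled'):
    # find() gives back either a tag or None, so no indexing is involved.
    return tag.text if tag is not None else default


# Intended use with BeautifulSoup (not executed here):
#     title = text_or_default(content.find('h1', attrs={"itemprop": "title"}))

# Stand-ins: an object with a .text attribute, and None for "no match".
title_found = text_or_default(SimpleNamespace(text='My title'))
title_missing = text_or_default(None)

print(title_found, title_missing)
```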

But if you did need to use find_all() you could do it like this.

if title_selector := content.findAll('h1', attrs={"itemprop": "title"}):
    title = title_selector[0].text
