Python Forum
Custom method to handle exceptions not working as expected
#1
Good afternoon,

I am working on a web scraping project using BeautifulSoup and, on occasion, Selenium webdriver.

BeautifulSoup's .findAll(...) method always returns a list, but that list may be empty.
I thought I would write a custom function to handle IndexError so that I don't need a try/except every time I scrape something.

def get_element(element, idx_error=None) -> Any:
    # TODO: This method does not work, idx error is not handled. FIX IT AS SOON AS POSSIBLE!!!
    value_ = None
    try:
        value_ = element

    except IndexError:
        if inspect.isfunction(idx_error):
            idx_error()
        else:
            return idx_error
Basically, I want to return value_ only if it exists; if the operation raises an IndexError, I want idx_error to be returned, or called if it is a function/method.
This data will then be used in a GUI, so I would rather get a placeholder value (e.g. 'Untitled') than a blocking error.

Example usage:
title_selector = content.findAll('h1', attrs={"itemprop": "title"})  # Returns a list of page elements
title = get_element(title_selector[0].text, 'Untitled') # Returns first element in title_selector if there is one; if IndexError returns 'Untitled'
Which should be the equivalent of this:
title_selector = content.findAll('h1', attrs={"itemprop": "title"})
title = None
try:
    title = title_selector[0].text
except IndexError:
    title = 'Untitled'
Unfortunately, for some reason I can't figure out, the function doesn't do anything, and the IndexError is raised anyway.

Any ideas?
#2
This won't work because the IndexError is raised in this line:
title = get_element(title_selector[0].text, 'Untitled') # Returns first element in title_selector if there is one; if IndexError returns 'Untitled'
Python evaluates the argument title_selector[0].text, which indexes the list, before get_element is ever called, so the try/except inside the function never sees the exception.

You could check whether the list is empty instead:
if not title_selector:
    title = 'Untitled'
else:
    title = whatever_you_want_to_do_instead
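To make the evaluation order concrete, here is a small sketch. It assumes an empty list as a stand-in for an empty findAll() result, and get_element is a simplified copy of the helper from post #1 rather than the exact original.

```python
def get_element(element, idx_error=None):
    # Simplified copy of the helper from post #1: the try block only
    # guards code inside the function, not the caller's argument.
    try:
        return element
    except IndexError:
        return idx_error


title_selector = []  # stand-in for an empty findAll() result

try:
    title = get_element(title_selector[0], 'Untitled')
except IndexError:
    # Raised while the argument is being evaluated, before get_element runs.
    title = 'raised at the call site'

# Checking the list first avoids the exception entirely.
checked_title = title_selector[0] if title_selector else 'Untitled'
```

The except branch at the call site is the one that runs, which shows that the function body never had a chance to catch anything.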
#3
(Dec-22-2022, 06:13 PM)Yoriz Wrote: This won't work because in the line
title = get_element(title_selector[0].text, 'Untitled') # Returns first element in title_selector if there is one; if IndexError returns 'Untitled'
that is where the IndexError will be raised before it gets to the function as it is trying to index the list

You could check if the list is empty
if not title_selector:
    title = 'Untitled'
else:
    title = whatever_you_want_to_do_instead

If I understood you correctly, Python raises the IndexError at the exact moment it evaluates the argument element, so execution never even enters the function. That makes sense, thank you for your insight.

I changed my method as follows, and now it works.
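import inspect
import warnings

def get_element(element: list, idx: int, idx_error):
    try:
        # Return the item directly instead of indexing twice.
        return element[idx]
    except IndexError:
        warnings.warn('Element not found on page.')
        if inspect.isfunction(idx_error):
            # A function fallback is only called; get_element then returns None.
            idx_error()
        else:
            return idx_error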
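A possible variation on the method above, offered as a sketch rather than a drop-in replacement. It assumes the fallback's return value should be handed back to the caller, and it uses callable() instead of inspect.isfunction(). The name get_element_or and the plain lists standing in for findAll() results are made up for this example.

```python
import warnings
from typing import Any, Callable, Sequence, Union


def get_element_or(items: Sequence[Any], idx: int,
                   fallback: Union[Any, Callable[[], Any]] = None) -> Any:
    """Return items[idx], or the fallback if idx is out of range."""
    try:
        return items[idx]
    except IndexError:
        warnings.warn('Element not found on page.')
        # callable() also accepts bound methods and builtins, which
        # inspect.isfunction() reports as False.
        return fallback() if callable(fallback) else fallback


# Plain lists stand in for findAll() results here.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # keep the demo output quiet
    first = get_element_or(['Page title'], 0, 'Untitled')
    missing = get_element_or([], 0, 'Untitled')
    computed = get_element_or([], 0, lambda: 'Untitled'.upper())
```

One trade-off of callable(): a class passed as the fallback would be instantiated rather than returned as-is, so this only fits if fallbacks are plain values or zero-argument callables.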
#4
I don't understand why you would use indexing with find_all(). I would expect code to iterate over the find_all() results, and an index error would not be possible.

Can you post an example of code where indexing is required? In your posted example, I don't see why you didn't use find() instead of find_all().

But if you did need to use find_all() you could do it like this.
if title_selector := content.findAll('h1', attrs={"itemprop": "title"}):
    title = title_selector[0].text
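As one way to read the point about iterating instead of indexing, the first result can be taken with next() and a default, so an empty result never raises. This is only a sketch: FakeTag is a hypothetical stand-in for a BeautifulSoup tag (only .text is modelled) and first_text is a made-up helper name.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FakeTag:
    """Hypothetical stand-in for a BeautifulSoup tag; only .text is modelled."""
    text: str


def first_text(tags: Iterable[FakeTag], default: str = 'Untitled') -> str:
    # The generator yields nothing for an empty result,
    # so next() falls back to the default instead of raising.
    return next((tag.text for tag in tags), default)


found = first_text([FakeTag('Page title'), FakeTag('Other')])
empty = first_text([])
```

With find() the situation is similar but the check is for None, since find() returns None when nothing matches.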