Custom method to handle exceptions not working as expected
Python Forum (https://python-forum.io), General Coding Help, /thread-39029.html
Custom method to handle exceptions not working as expected - gradlon93 - Dec-22-2022

Good afternoon,

I am working on a web scraping project using BeautifulSoup and, on occasion, Selenium webdriver. BeautifulSoup's .findAll(...) method always returns a list, but the list may be empty. I wanted to write a custom function that handles IndexError, so that I don't need a try/except every time I scrape something.

    import inspect
    from typing import Any

    def get_element(element, idx_error=None) -> Any:
        # TODO: This method does not work, idx error is not handled. FIX IT AS SOON AS POSSIBLE!!!
        value_ = None
        try:
            value_ = element
        except IndexError:
            if inspect.isfunction(idx_error):
                idx_error()
            else:
                return idx_error

Basically, I want to return value_ only if it exists; if the operation raises an IndexError, I want idx_error to be returned, or called if it is a function/method.

The data will then be used in a GUI, so I would rather get a placeholder value (e.g. 'Untitled') than a blocking error.

Example usage:

    title_selector = content.findAll('h1', attrs={"itemprop": "title"})  # Returns a list of page elements
    title = get_element(title_selector[0].text, 'Untitled')  # Returns first element in title_selector if there is one; if IndexError returns 'Untitled'

Which should be the equivalent of this:

    title_selector = content.findAll('h1', attrs={"itemprop": "title"})
    title = None
    try:
        title = title_selector[0].text
    except IndexError:
        title = 'Untitled'

Unfortunately, for some reason I can't figure out, it doesn't do anything, and the IndexError is raised anyway. Any ideas?
RE: Custom method to handle exceptions not working as expected - Yoriz - Dec-22-2022

This won't work because in the line

    title = get_element(title_selector[0].text, 'Untitled')  # Returns first element in title_selector if there is one; if IndexError returns 'Untitled'

the IndexError is raised before execution ever reaches the function: Python evaluates title_selector[0] while building the arguments, and indexing an empty list fails right there.

You could check whether the list is empty instead:

    if not title_selector:
        title = 'Untitled'
    else:
        title = whatever_you_want_to_do_instead

RE: Custom method to handle exceptions not working as expected - gradlon93 - Dec-22-2022

(Dec-22-2022, 06:13 PM)Yoriz Wrote: This won't work because in the line

If I got you right, Python raises the IndexError at the exact moment it tries to evaluate the argument element of the function call. So execution never gets inside the function; the call never gets "initialised" (perhaps not the best term, but it expresses the idea).

That makes sense, thank you for your insight. I changed my function as follows, and now it works.

    import inspect
    import warnings

    def get_element(element: list, idx: int, idx_error):
        try:
            element[idx]
        except IndexError:
            warnings.warn('Element not found on page.')
            if inspect.isfunction(idx_error):
                idx_error()
            else:
                return idx_error
        else:
            return element[idx]

RE: Custom method to handle exceptions not working as expected - deanhystad - Dec-22-2022

I don't understand why you would use indexing with find_all(). I would expect code to iterate over the find_all() results, in which case an index error would not be possible. Can you post an example of code where indexing is required?

In your posted example I don't see why you didn't use find() instead of find_all(). But if you did need to use find_all(), you could do it like this:

    if title_selector := content.findAll('h1', attrs={"itemprop": "title"}):
        title = title_selector[0].text
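The point made in the thread, that a call's arguments are evaluated before the function body runs, can be illustrated with a minimal sketch. It uses a plain list in place of BeautifulSoup results, and the names get_or_default and items are hypothetical, not taken from the thread. Wrapping the lookup in a lambda is one possible way to defer the indexing until it is inside the try block. Returning the result of a callable default is an assumption about the desired behavior.

```python
from typing import Any, Callable


def get_or_default(getter: Callable[[], Any], default: Any = None) -> Any:
    """Call getter() and return its result; fall back to default on IndexError.

    If default is callable, it is called and its result is returned
    (an assumption about the desired behavior, not confirmed by the thread).
    """
    try:
        return getter()
    except IndexError:
        return default() if callable(default) else default


items: list = []  # stands in for an empty findAll() result

# Eager: items[0] is evaluated while the argument list is being built,
# so IndexError is raised before get_or_default is ever entered.
try:
    get_or_default(items[0], "Untitled")
    eager_result = "no error"
except IndexError:
    eager_result = "IndexError raised at the call site"

# Deferred: the lambda body only runs inside the function's try block.
title = get_or_default(lambda: items[0], "Untitled")
first = get_or_default(lambda: ["Hello"][0], "Untitled")
```

With real scraping code the call would look like get_or_default(lambda: title_selector[0].text, 'Untitled'). The trade-off is the extra lambda at each call site, which is why the index-passing version posted in the thread or a plain emptiness check may read more naturally.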