Handling exception from a module

dchi2 · Nov-23-2019, 09:18 AM

Very new to Python, grew up on PHP. I'm using the pdftitle module for its intended purpose, but I don't seem to be able to gracefully handle the exceptions it throws.

The exceptions I've come across are either a recursion limit error or "pdfminer.pdffont.PDFUnicodeNotDefined". I'm happy to just skip the documents where these occur, but have been unable to. I'm not sure whether the cause is the "During handling of the above exception, another exception occurred:" chaining or the overall nesting inside the module.

try:
    PdfTitle = pdftitle.run(FilePath)
except:
    print(FilePath)
    print("an exception occurred")

Expected result: the file name and "an exception occurred" are printed. Actual result: the exception output below.

Traceback (most recent call last):
  File "C:\Program Files (x86)\Python38-32\lib\site-packages\pdfminer\pdffont.py", line 580, in to_unichr
    return self.cid2unicode[cid]
KeyError: 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Program Files (x86)\Python38-32\lib\s
...
  File "C:\Program Files (x86)\Python38-32\lib\site-packages\pdfminer\pdffont.py", line 582, in to_unichr
    raise PDFUnicodeNotDefined(None, cid)
pdfminer.pdffont.PDFUnicodeNotDefined: (None, 1)
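
As background on the "During handling of the above exception, another exception occurred:" line: it only reports exception chaining, meaning a second exception was raised inside an except block. Only that last exception propagates, and an ordinary try/except around the call can normally catch it. Here is a minimal sketch of that behaviour. The UnicodeNotDefined class and the to_unichr function are hypothetical stand-ins that imitate the shape of the traceback; they are not pdfminer's actual code.

```python
class UnicodeNotDefined(Exception):
    """Hypothetical stand-in for pdfminer.pdffont.PDFUnicodeNotDefined."""


def to_unichr(cid2unicode, cid):
    # A KeyError from the lookup is replaced by a different exception
    # raised inside the except block, which produces a chained traceback.
    try:
        return cid2unicode[cid]
    except KeyError:
        raise UnicodeNotDefined(None, cid)


def describe_failure(cid2unicode, cid):
    try:
        to_unichr(cid2unicode, cid)
    except UnicodeNotDefined as exc:
        # The original KeyError is kept on the __context__ attribute.
        return type(exc).__name__, type(exc.__context__).__name__
    return None


print(describe_failure({}, 1))
```

So chaining by itself should not stop an except clause from running; if the handler never fires, the cause is more likely somewhere else.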

**Larz60+** · Nov-23-2019, 09:23 AM

Use:

try:
    PdfTitle = pdftitle.run(FilePath)
except KeyError:
    print(f"Key error encountered: {FilePath}")
    raise

The bare raise re-raises the KeyError after the message is printed. Exceptions other than KeyError are not caught by this clause at all, so PDFUnicodeNotDefined will still propagate.
You can catch that one as well if so desired.
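
If the goal is to skip a file on a known set of failures, several exception types can be listed in one except clause. The sketch below is an illustration only: skip_on_error, recurse_forever and missing_key are hypothetical names, and the two failing functions merely simulate the recursion limit and KeyError cases mentioned in the first post. Whether PDFUnicodeNotDefined derives from KeyError is not shown in this thread, so it would have to be named explicitly or caught through a broader class.

```python
def skip_on_error(func, arg, skipped):
    """Call func(arg); on KeyError or RecursionError record arg and return None."""
    try:
        return func(arg)
    except (KeyError, RecursionError) as exc:
        skipped.append((arg, type(exc).__name__))
        return None


def recurse_forever(path):
    # Hypothetical failing parser that hits the recursion limit.
    return recurse_forever(path)


def missing_key(path):
    # Hypothetical failing parser that raises KeyError.
    return {}[path]


skipped = []
skip_on_error(recurse_forever, "a.pdf", skipped)
skip_on_error(missing_key, "b.pdf", skipped)
print(skipped)
```

Any exception type not listed in the tuple still propagates to the caller.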

dchi2 · Nov-23-2019, 10:51 PM

Sorry, I wasn't clear in my post. I want to catch all exceptions when running that module, which is what I expect an "except:" clause with no exception name to do.

I did now try using "except Keyerror" but the result did not change.

**Larz60+** · (This post was last modified: Nov-23-2019, 11:12 PM by Larz60+.)

It's not: except Keyerror:
it's: except KeyError:
Exception names are case sensitive.

To catch all exceptions, write it like (untested):

import sys


try:
    PdfTitle = pdftitle.run(FilePath)
except:
    print(f"Unexpected exception: {sys.exc_info()[0]}")
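
A note on what "all exceptions" means here. A bare "except:" catches everything derived from BaseException, including SystemExit and KeyboardInterrupt, while "except Exception:" leaves those two alone. Binding the exception with "as exc" also gives the same information as sys.exc_info() more directly. The sketch below uses a hypothetical helper named classify to show which clause handles what.

```python
import sys


def classify(func):
    """Report which except clause handles whatever func() raises."""
    try:
        func()
    except Exception as exc:
        # Ordinary errors: ZeroDivisionError, KeyError, TypeError, ...
        return f"Exception: {type(exc).__name__}"
    except BaseException as exc:
        # SystemExit, KeyboardInterrupt and GeneratorExit end up here.
        return f"BaseException only: {type(exc).__name__}"
    return "no exception"


print(classify(lambda: 1 / 0))
print(classify(sys.exit))
```

This matters when a library calls sys.exit() internally: "except Exception:" would let the exit through, while a bare "except:" would intercept it.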

dchi2 · Nov-23-2019, 11:35 PM

Sorry that was just a typo, I did enter it as KeyError.

This works as expected, only output is "an exception occurred":

try:
    PdfTitle = 1 / 0
except:
    print("an exception occurred")

This does not catch the exception. The traceback still appears and "an exception occurred" is not printed:

try:
    PdfTitle = pdftitle.run(FilePath)
except:
    print("an exception occurred")
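
One situation that produces exactly these symptoms is a function that catches the exception itself, prints the traceback, and then returns normally. In that case nothing reaches the caller's except clause even though a traceback shows up on the console. The sketch below demonstrates the effect with a hypothetical function, run_like_a_cli. It is not pdftitle's real implementation, and whether pdftitle.run behaves this way is an assumption that would need checking against the module's source.

```python
import traceback


def run_like_a_cli(path):
    # Hypothetical function, NOT pdftitle's real implementation.
    try:
        raise ValueError(f"cannot read {path}")
    except Exception:
        traceback.print_exc()  # the traceback is written to stderr
        return 1               # the error is reported as a status code


def call_with_handler(path):
    handler_ran = False
    try:
        result = run_like_a_cli(path)
    except Exception:
        handler_ran = True
        result = None
    return result, handler_ran


print(call_with_handler("example.pdf"))
```

Here the traceback is printed, the handler never runs, and the caller receives 1 as the return value. Inspecting the return value is then the only way to detect the failure.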

**Larz60+** · Nov-23-2019, 11:50 PM

Try the code from post 4.

dchi2 · Nov-23-2019, 11:58 PM

I don't see how that differs from what I had in post 1 or 5, aside from the print text, but I have tried it just in case and it still does not catch the exception.

**Larz60+** · Nov-24-2019, 03:32 AM

You're doing something wrong.
Please show all of your code.

dchi2 · Nov-24-2019, 05:23 AM

The if statement on line 23 is there because of the uncaught exceptions. I wanted to suppress the exception text and go straight to the else on line 26.

import argparse, os, pdftitle, re

parser = argparse.ArgumentParser(description='Generate filenames from PDF titles.')
parser.add_argument('path', help='Starting folder path')
parser.add_argument('-r', '--rename', action='store_true', help='Rename files (otherwise just display)')
args = parser.parse_args()

def pdf_recurse(SrcFolder):

    for FileName in os.listdir(SrcFolder):
        FilePath = SrcFolder + '\\' + FileName
        if os.path.isdir(FilePath):
            pdf_recurse(FilePath)
        else:
            FileExt = FilePath[-3:]
            if FileExt.lower() == 'pdf':                
                PdfTitle = ""
                try:
                    PdfTitle = pdftitle.run(FilePath)
                except:                    
                    print(FilePath)
                    print("an exception occurred")
                if PdfTitle == "" or PdfTitle == 1:
                    print(FilePath)
                    print("Could not read")
                else:
                    NewName = new_name(PdfTitle, FileName)
                    if NewName != "" and NewName != FileName:
                        print(FilePath)
                        print(NewName)
                        if args.rename:
                            os.rename(r'' + str(FilePath), r'' + SrcFolder + '\\' + NewName)

def new_name(ReadTitle, FileName):
    if len(ReadTitle) < 6:
        return ""
    Match = re.search(r'(iptc|spe)[\s\-]{0,1}[0-9]+', FileName, re.IGNORECASE)
    if Match is not None:
        return ""
    if len(ReadTitle) > 72:
        ReadTitle = ReadTitle[:72]
    NewName = re.sub(r'[^\w_.)( -]', '', ReadTitle) + '.pdf'
    return NewName

pdf_recurse(args.path)
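
Separately from the exception question, the folder walk can be written without hard-coded '\\' separators or slicing the last three characters of the path (which would also match a file named "notapdf"). A minimal sketch using pathlib follows. The helper names iter_pdfs and renamed_path are hypothetical and not part of the script above.

```python
from pathlib import Path


def iter_pdfs(root):
    """Yield every file under root whose extension is .pdf, in any case."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() == ".pdf":
            yield path


def renamed_path(path, new_name):
    """Return the rename target: same folder as path, new file name."""
    return path.with_name(new_name)


# Example: list the PDFs below the current directory.
for pdf in iter_pdfs("."):
    print(pdf)
```

Path.rglob handles the recursion into subfolders, so no explicit recursive function is needed, and the resulting Path objects can be passed straight to os.rename.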

**Larz60+** · Nov-24-2019, 08:31 AM

The following worked for me (changes on lines 1, 22 and 24):

import argparse, os, pdftitle, re, sys

 
parser = argparse.ArgumentParser(description='Generate filenames from PDF titles.')
parser.add_argument('path', help='Starting folder path')
parser.add_argument('-r', '--rename', action='store_true', help='Rename files (otherwise just display)')
args = parser.parse_args()
 
def pdf_recurse(SrcFolder):
 
    for FileName in os.listdir(SrcFolder):
        FilePath = SrcFolder + '\\' + FileName
        if os.path.isdir(FilePath):
            pdf_recurse(FilePath)
        else:
            FileExt = FilePath[-3:]
            if FileExt.lower() == 'pdf':                
                PdfTitle = ""
                try:
                    PdfTitle = pdftitle.run(FilePath)
                except:
                    print(f"Unexpected exception: {sys.exc_info()[0]}")
                    print(FilePath)
                    # print("an exception occurred")
                if PdfTitle == "" or PdfTitle == 1:
                    print(FilePath)
                    print("Could not read")
                else:
                    NewName = new_name(PdfTitle, FileName)
                    if NewName != "" and NewName != FileName:
                        print(FilePath)
                        print(NewName)
                        if args.rename:
                            os.rename(r'' + str(FilePath), r'' + SrcFolder + '\\' + NewName)
 
def new_name(ReadTitle, FileName):
    if len(ReadTitle) < 6:
        return ""
    Match = re.search(r'(iptc|spe)[\s\-]{0,1}[0-9]+', FileName, re.IGNORECASE)
    if Match is not None:
        return ""
    if len(ReadTitle) > 72:
        ReadTitle = ReadTitle[:72]
    NewName = re.sub(r'[^\w_.)( -]', '', ReadTitle) + '.pdf'
    return NewName
 
pdf_recurse(args.path)

Results (part of the path replaced to protect privacy):

Output:
Unexpected exception: <class 'TypeError'>
/.../pdf/OCR+ConvertedNov4_2014TownElectionReturns\AnsoniaNew.pdf
/.../pdf/OCR+ConvertedNov4_2014TownElectionReturns\AnsoniaNew.pdf
Could not read
Unexpected exception: <class 'TypeError'>
/.../pdf/OCR+ConvertedNov4_2014TownElectionReturns\Ansonia.pdf
/.../pdf/OCR+ConvertedNov4_2014TownElectionReturns\Ansonia.pdf
Could not read
Unexpected exception: <class 'TypeError'>
/.../pdf/OCR+ConvertedNov4_2014TownElectionReturns\AshfordNew.pdf
/.../pdf/OCR+ConvertedNov4_2014TownElectionReturns\AshfordNew.pdf
Could not read
Unexpected exception: <class 'TypeError'>
/.../pdf/OCR+ConvertedNov4_2014TownElectionReturns\Andover.pdf
/.../pdf/OCR+ConvertedNov4_2014TownElectionReturns\Andover.pdf
Could not read
Unexpected exception: <class 'TypeError'>
/.../pdf/OCR+ConvertedNov4_2014TownElectionReturns\Ashford.pdf
/.../pdf/OCR+ConvertedNov4_2014TownElectionReturns\Ashford.pdf
Could not read
Unexpected exception: <class 'TypeError'>
/.../pdf/OCR+ConvertedNov4_2014TownElectionReturns\AndoverNew.pdf
/.../pdf/OCR+ConvertedNov4_2014TownElectionReturns\AndoverNew.pdf
Could not read
Unexpected exception: <class 'TypeError'>
/.../pdf/OCR+ConvertedNov4_2014TownElectionReturns\AvonNew.pdf
/.../pdf/OCR+ConvertedNov4_2014TownElectionReturns\AvonNew.pdf
Could not read
Unexpected exception: <class 'TypeError'>
/.../pdf/OCR+ConvertedNov4_2014TownElectionReturns\Avon.pdf
/.../pdf/OCR+ConvertedNov4_2014TownElectionReturns\Avon.pdf
Could not read
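
Every file in this output fails with the same TypeError, and the exception class alone does not say where it was raised. A TypeError can come from inside the called function or from the call itself, for example when an argument is passed to a function that accepts none. The sketch below shows one way to tell the two apart by looking at the innermost traceback frame. All function names in it are hypothetical, and it makes no claim about what happened in the run above.

```python
import traceback


def call_and_locate(func, *args):
    """Return (result, None), or (None, where the exception was raised)."""
    try:
        return func(*args), None
    except Exception as exc:
        frames = traceback.extract_tb(exc.__traceback__)
        innermost = frames[-1]
        return None, f"{type(exc).__name__} raised in {innermost.name}()"


def takes_no_arguments():
    # Hypothetical function that accepts no parameters.
    return "title"


def fails_inside(path):
    # Hypothetical function that fails in its own body.
    return len(None)


print(call_and_locate(takes_no_arguments, "a.pdf"))
print(call_and_locate(fails_inside, "a.pdf"))
```

When the innermost frame is the caller itself, the call signature is the problem rather than the file being processed.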
