Apr-20-2018, 06:50 PM
Hello
I am trying to read each file in a directory and then extract the text from each PDF file, but it gives me an error message. I checked that PyPDF2 is installed: running pip install PyPDF2 says it is already installed. Here is the code:
```python
from PyPDF2 import PdfFileReader, PdfFileWriter
import os

directory = os.listdir("C:\example")

for file in directory:
    if file.endswith(".pdf"):
        pfile = open("C:\example\\" + file, 'rb')
        pdfFile = PdfFileReader(open(pfile)
        page = pdfFile.getPage(0)
        print(page.extractText())
```
Error:
```
C:\WSDL>Read_PDF.py
  File "C:\WSDL\Read_PDF.py", line 9
    pdfFile = PdfFileReader(open(pfile)
                                      ^
TabError: inconsistent use of tabs and spaces in indentation
```
However, if I change the code to this:
```python
from PyPDF2 import PdfFileReader, PdfFileWriter
import os

directory = os.listdir("C:\example")

for file in directory:
    if file.endswith(".pdf"):
        pfile = open("C:\example\\" + file, 'rb')
        print(pfile)
        #pdfFile = PdfFileReader(open(pfile)
        #page = pdfFile.getPage(0)
        #print(page.extractText())
```
This version works, so I am not sure whether I am using PyPDF2 correctly. Any help would be great.
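
For comparison, here is a minimal sketch of how the loop could be written. It rests on a few assumptions. The TabError means the indentation mixes tabs and spaces, so the sketch uses spaces only. The posted `PdfFileReader(open(pfile)` line is also missing a closing parenthesis, and `pfile` is already an open file object, so it is passed to `PdfFileReader` directly instead of being wrapped in a second `open()`. The helper names `pdf_paths` and `first_page_text` are hypothetical, and the sketch assumes an older PyPDF2 release in which `PdfFileReader`, `getPage` and `extractText` (the names used in the post) are still available.

```python
import os


def pdf_paths(directory):
    """Return the full paths of the .pdf files in `directory`, sorted by name."""
    return [
        os.path.join(directory, name)  # avoids hand-built "C:\\example\\" strings
        for name in sorted(os.listdir(directory))
        if name.lower().endswith(".pdf")  # also matches ".PDF", unlike the post
    ]


def first_page_text(path):
    """Return the text of the first page of the PDF at `path`."""
    # Imported here so that listing files does not require PyPDF2.
    # Assumption: a PyPDF2 release that still provides PdfFileReader.
    from PyPDF2 import PdfFileReader

    with open(path, "rb") as stream:
        reader = PdfFileReader(stream)  # pass the open file object itself
        return reader.getPage(0).extractText()


# Example use, with the directory from the post written as a raw string:
# for path in pdf_paths(r"C:\example"):
#     print(first_page_text(path))
```

Splitting the directory listing from the PDF reading makes it possible to check which files are found before PyPDF2 is involved at all, which is roughly what the working version with `print(pfile)` does.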