Apr-20-2018, 06:50 PM
Hello
I am trying to read each file in a directory and then extract the text from each PDF file, but I am getting an error message. Here is the code:
from PyPDF2 import PdfFileReader, PdfFileWriter
import os
directory = os.listdir("C:\example")
for file in directory:
    if file.endswith(".pdf"):
        pfile = open("C:\example\\"+file,'rb')
        pdfFile = PdfFileReader(open(pfile)
        page = pdfFile.getPage(0)
        print(page.extractText())
And it is giving me this error message:
Error:
C:\WSDL>Read_PDF.py
  File "C:\WSDL\Read_PDF.py", line 9
    pdfFile = PdfFileReader(open(pfile)
    ^
TabError: inconsistent use of tabs and spaces in indentation
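A note on the error itself: TabError is raised by Python while it parses the file, before any PyPDF2 call runs, so it can be reproduced without the library. Below is a minimal sketch of that check. The helper name has_tab_error and the two short snippets are made up for illustration and are not taken from Read_PDF.py.

```python
def has_tab_error(source):
    """Return True if compiling `source` raises TabError."""
    try:
        compile(source, "<snippet>", "exec")
    except TabError:
        return True
    return False


# Illustrative snippet: one body line indented with 8 spaces, the next with a tab.
mixed = "if True:\n        x = 1\n\ty = 2\n"

# The same code with spaces only.
spaces_only = "if True:\n        x = 1\n        y = 2\n"

print(has_tab_error(mixed))
print(has_tab_error(spaces_only))
```

The standard library also ships a checker for real files, which can be run as python -m tabnanny Read_PDF.py to point at lines with ambiguous indentation.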
However, if I change the code to this:
from PyPDF2 import PdfFileReader, PdfFileWriter
import os
directory = os.listdir("C:\example")
for file in directory:
    if file.endswith(".pdf"):
        pfile = open("C:\example\\"+file,'rb')
        print(pfile)
        #pdfFile = PdfFileReader(open(pfile)
        #page = pdfFile.getPage(0)
        #print(page.extractText())
this code works, so I am not sure whether PyPDF2 is having issues. I checked whether it was installed, and it is: I ran pip install PyPDF2 and it says it is already installed.
So I am not sure if I am using PyPDF2 correctly. Any help would be great.
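For comparison, here is a hedged sketch of how the loop might look with spaces-only indentation, balanced parentheses, and the already opened file handed straight to PdfFileReader instead of being wrapped in a second open(). It assumes a PyPDF2 release older than 3.0, where PdfFileReader, getPage and extractText still exist; newer releases use PdfReader, reader.pages[0] and extract_text() instead. The helper names pdf_paths and first_page_text are hypothetical, and the folder is the one from the post.

```python
import os


def pdf_paths(folder):
    """Return the full paths of the .pdf files directly inside `folder`."""
    return [
        os.path.join(folder, name)
        for name in sorted(os.listdir(folder))
        if name.lower().endswith(".pdf")
    ]


def first_page_text(path):
    """Extract the text of page 0 (assumes PyPDF2 < 3.0 is installed)."""
    # Imported here so the rest of the sketch runs without PyPDF2.
    from PyPDF2 import PdfFileReader

    with open(path, "rb") as handle:
        # Pass the open file object itself; calling open() on it again fails.
        reader = PdfFileReader(handle)
        return reader.getPage(0).extractText()


if __name__ == "__main__":
    # Raw string keeps the backslash literal; path taken from the post.
    folder = r"C:\example"
    if os.path.isdir(folder):
        for path in pdf_paths(folder):
            print(first_page_text(path))
```

Using os.path.join avoids building the path by hand with "C:\example\\"+file, and the with block closes each file once its text has been read.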