Python Forum
PyPDF2 deprecation problem
#1
Hi there! I am following a tutorial, and here's the code:

import pyttsx3, PyPDF2

from PyPDF2 import PdfReader

pdfreader = PyPDF2.PdfReader(open('book.pdf', 'rb'))
speaker = pyttsx3.init()

for page_num in range(pdfreader.numPages):
    text = pdfreader.getPage(page_num).extractText()
    clean_text = text.strip().replace('\n', ' ')
    print(clean_text)

speaker.save_to_file(clean_text, 'story.mp3')
speaker.runAndWait()

speaker.stop()
But I get this error:

Error:
reader.numPages is deprecated and was removed in PyPDF2 3.0.0. Use len(reader.pages) instead.
OK. So if a function is removed, how do I find its replacement?

Well, the replacement is len(reader.pages), but if I try to use it like this:

for page_num in len(pdfreader.pages):
I get this error:

Error:
TypeError : 'int' object is not iterable
As a beginner, I am not used to this kind of issue.
#2
Compare
(Sep-20-2023, 10:42 AM)gowb0w Wrote: for page_num in range(pdfreader.numPages):

with

(Sep-20-2023, 10:42 AM)gowb0w Wrote: for page_num in len(pdfreader.pages):

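Why did you skip the range part? len() returns a single int, and a for loop cannot iterate over an int.

That said, there are other deprecated parts: getPage() and extractText() were also removed in PyPDF2 3.0.0, in favour of reader.pages[i] and extract_text(). Also, iterating over pdfreader.pages lets you work directly with each page, so there is no need to use an index to get the page object.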
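A minimal sketch of what the loop might look like after following the hints above. It assumes PyPDF2 3.x, where reader.pages and page.extract_text() replace numPages, getPage() and extractText(). The helper names collect_text and read_pdf_text are made up for illustration. It also collects the text of every page, whereas the original loop only kept the last page in clean_text.

```python
def collect_text(pages):
    """Join the cleaned text of every page into one string.

    `pages` is any iterable of objects with an extract_text() method,
    such as PdfReader.pages.
    """
    chunks = []
    for page in pages:  # iterate over the pages directly, no index needed
        text = page.extract_text() or ''
        chunks.append(text.strip().replace('\n', ' '))
    return ' '.join(chunks)


def read_pdf_text(path):
    """Open the PDF at `path` and return its cleaned text (assumes PyPDF2 3.x)."""
    # Imported here so collect_text can be used without PyPDF2 installed.
    from PyPDF2 import PdfReader

    reader = PdfReader(path)
    print(f'{len(reader.pages)} pages')  # len(reader.pages) replaces numPages
    return collect_text(reader.pages)
```

If you do need the index, for page_num in range(len(pdfreader.pages)): works as well. The TypeError came from looping over the bare integer that len() returns.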
#3
I don't know how to answer your question definitively. There may be (probably is) a better way.
Since packages are contributed by their authors, the extent of the documentation is up to each contributor.

The following are some things that can help.
You can look at the changelog posted on PyPI, if there is one available.
When you look up a package, all versions will be displayed. Look to see if there is a newer version
(this doesn't directly help you find deprecated code, but the release notes may, depending on the contributor).
In this instance,
PyPDF2 has been superseded by pypdf, which is where development now continues,
see: pypdf 3.16.1, Released: Sep 17, 2023
Please also see:
Analyzing PyPI package downloads
And Google BigQuery
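One more thing that can help: libraries usually announce a removal ahead of time with a DeprecationWarning whose message names the replacement. A minimal sketch using only the standard warnings module. Here old_num_pages is a made-up stand-in for a deprecated library function, not actual PyPDF2 code, and deprecation_messages is a hypothetical helper.

```python
import warnings


def old_num_pages(pages):
    """Hypothetical deprecated function that points to its replacement."""
    warnings.warn(
        'old_num_pages is deprecated. Use len(pages) instead.',
        DeprecationWarning,
        stacklevel=2,
    )
    return len(pages)


def deprecation_messages(func, *args):
    """Call func and return (result, list of deprecation messages it emitted)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter('always')  # make sure nothing is filtered out
        result = func(*args)
    messages = [str(w.message) for w in caught
                if issubclass(w.category, DeprecationWarning)]
    return result, messages


result, messages = deprecation_messages(old_num_pages, ['p1', 'p2'])
print(result, messages)
```

Running a script with python -W error::DeprecationWarning turns such warnings into errors, so they are hard to miss before the function is actually removed.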
#4
I often want to get a range of pages from a pdf file. One day I got exactly this error, but it was easily fixed.

This worked for me last time I tried.

from PyPDF2 import PdfWriter, PdfReader
import os

pathToPDF = input('Enter the folder with your PDFs, something like /home/pedro/Latin/ ... ')
path2Extracts = '/home/pedro/pdfExtractedPages/'
# get the names of the PDF files available to extract from
files = [f for f in os.listdir(pathToPDF) if f.lower().endswith('.pdf')]
# show the files so you can choose 1
for f in files:
    print(f)
bookname = input('Which PDF do you want? Enter the full file name ... ')
# read the pdf
pdf = PdfReader(os.path.join(pathToPDF, bookname))
# pages = pdf.getNumPages() (deprecated)
pages = len(pdf.pages)
print('This pdf has ' + str(pages) + ' pages')
print('What pages do you want to get?')
startnum = input('what is the starting page number?  ')
print('If your last page is page 76, enter 76 for the end number')
endnum = input('what is the last page number?  ')
# pdf.pages is zero-based, so page 1 is index 0
start = int(startnum) - 1
end = int(endnum)
# only need to open pdfWriter 1 time
pdf_writer = PdfWriter()
for page in range(start, end):
    pdf_writer.add_page(pdf.pages[page])

savename = input('Enter the name to save this pdf under, like CE3U8 No need to add .pdf ... ')
output_filename = savename + '.pdf'

with open(os.path.join(path2Extracts, output_filename), 'wb') as out:
    pdf_writer.write(out)
print(f'Created: {output_filename} and saved in', path2Extracts)
print('All done!')
#5
(Sep-20-2023, 12:32 PM)Pedroski55 Wrote: I often want to get a range of pages from a pdf file. One day I got exactly this error, but it was easily fixed.

So good! Thank you for your example! I'm currently studying it to modify my own code.

Btw, about the letter before a string, 'f', 'r' or 's':

can you explain the difference between those, please?
#6
Better ask the experts, I'm not too good at this.

I know f works like this

var1 = 'beautiful girl'
print(f'I love a {var1}.')

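I have read that you can put any expression, even a function call, between the {}, which makes it very flexible!

r'...' is a raw string: backslashes are kept as ordinary characters instead of starting an escape sequence like \n. That saves you from escaping them, for example in Windows paths or regular expressions. (Bytes use a different prefix, b'...'.)

s'...' is not a valid string prefix in Python. The old way of formatting strings used %s as a placeholder, which may be where the s comes from.

You can also just search for Python f-string or Python raw string.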
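A short sketch of the prefixes side by side, for comparison. The variable names are only illustrative.

```python
name = 'world'

# f-string: expressions inside {} are evaluated and inserted
greeting = f'Hello {name.upper()}, 2 + 3 = {2 + 3}'

# raw string: each backslash stays a literal backslash
raw = r'C:\new\table'
# normal string: \n and \t are turned into a newline and a tab
cooked = 'C:\new\table'

# bytes literal: b'...' creates a bytes object, not a str
data = b'abc'

# old-style formatting: %s is a placeholder inside the string, not a prefix
old_style = 'Hello %s' % name

print(greeting)
print(len(raw), len(cooked))
print(type(data).__name__)
print(old_style)
```

The raw and normal versions have different lengths because the normal string collapses each escape sequence into a single character.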