Python Forum
PDF Extract using CSV values
#2
Assuming that your CSV file looks something like this:
Set Page, Start Page, End Page
2, 4, 8
2, 10, 14
2, 16, 20 
Then this will do what you're asking:
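# Written against the PyPDF2 1.x API. PyPDF2 3.x replaces these names with
# PdfReader, PdfWriter, reader.pages[n] and writer.add_page().
from PyPDF2 import PdfFileReader, PdfFileWriter

pdf_file_path = 'document.pdf'
file_base_name = pdf_file_path.replace('.pdf', '')

pdf = PdfFileReader(pdf_file_path)

# One writer is shared by every CSV row, so each file written below also
# contains the pages added for earlier rows.
pdfWriter = PdfFileWriter()

with open('page values.csv', 'r') as page_values_file:
	page_values_file.readline()  # skip the header row

	for line in page_values_file:
		if not line.strip():
			continue  # ignore blank lines
		page_values = line.strip().split(',')
		setpage = int(page_values[0])
		startpage = int(page_values[1])
		endpage = int(page_values[2])

		# getPage() is zero-based and range() stops before endpage.
		for page_num in range(startpage, endpage):
			pdfWriter.addPage(pdf.getPage(page_num))

		# Rows with the same Set Page value write to the same file name.
		with open('%s_subset_%s.pdf' % (file_base_name, setpage), 'wb') as f:
			pdfWriter.write(f)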
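If the Set Page column is meant to group rows into one output file per set (an assumption; the script above only uses it in the file name), the CSV parsing could be separated from the PDF writing roughly as follows. This is a minimal sketch using only the standard library. The function name `pages_by_set`, the `inclusive_end` flag, and the sample row with set 3 are hypothetical and not part of the original post.

```python
import csv
import io
from collections import defaultdict


def pages_by_set(csv_file, inclusive_end=False):
    """Map each 'Set Page' value to the zero-based page indexes listed for it.

    csv_file is any iterable of text lines (an open file or io.StringIO).
    With inclusive_end=False the end page is excluded, which matches
    range(startpage, endpage) in the script above.
    """
    # skipinitialspace drops the blank after each comma in "2, 4, 8".
    reader = csv.reader(csv_file, skipinitialspace=True)
    next(reader, None)  # skip the header row

    grouped = defaultdict(list)
    for row in reader:
        if not any(cell.strip() for cell in row):
            continue  # ignore blank lines
        set_page, start, end = (int(value) for value in row[:3])
        stop = end + 1 if inclusive_end else end
        grouped[set_page].extend(range(start, stop))
    return dict(grouped)


# Hypothetical sample data: the last row uses set 3 to show the grouping.
sample = io.StringIO(
    "Set Page, Start Page, End Page\n"
    "2, 4, 8\n"
    "2, 10, 14\n"
    "3, 16, 20\n"
)
for set_page, page_indexes in pages_by_set(sample).items():
    print(set_page, page_indexes)
```

Each list of indexes could then be fed to a fresh PdfFileWriter per set, so that one set's file never picks up pages from another set.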
Messages In This Thread
PDF Extract using CSV values - by atomxkai - Jan-11-2022, 07:03 PM
RE: PDF Extract using CSV values - by BashBedlam - Jan-11-2022, 07:30 PM
RE: PDF Extract using CSV values - by atomxkai - Jan-12-2022, 06:14 PM
RE: PDF Extract using CSV values - by BashBedlam - Jan-12-2022, 06:41 PM
RE: PDF Extract using CSV values - by atomxkai - Jan-12-2022, 09:15 PM
RE: PDF Extract using CSV values - by Pedroski55 - Jan-13-2022, 12:20 PM
