Python Forum
Recommended way to read/create PDF file?
#1
Hello,

I need to either read/fill/sign an existing PDF file, or build one from scratch.

According to this code, the existing PDF file uses fonts that aren't installed in c:\windows\fonts, so I assume they are embedded in the PDF.

I know nothing about working with PDFs, and would like to have your advice about how to proceed.

Should I fill in the existing file by overlaying my own text and a signature PNG and merging the result into a new PDF, or should I create a new PDF from scratch, using either the embedded fonts (which I'd have to export somehow) or a close-enough substitute font?

FWIW, I'm using Python 3.7.0 on 32-bit Windows 7.

Thank you.

PS: Incidentally,
#python -m pip install pdfrw
from pdfrw import PdfReader
→ ImportError: cannot import name 'PdfReader' from 'pdfrw'
--
Edit: Found the error: my script was named pdfrw.py, and I didn't know that a Python script can't share its name with the module it imports.
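
For anyone hitting the same ImportError, here is a minimal sketch of how to check for this kind of name clash using only the standard library. The helper names `find_shadowing_scripts` and `describe_import_origin` are made up for illustration; they are not part of pdfrw or any other package.

```python
import importlib.util
from pathlib import Path


def find_shadowing_scripts(directory, module_names):
    """Return the .py files in `directory` whose name matches a module name."""
    wanted = set(module_names)
    return sorted(p for p in Path(directory).glob("*.py") if p.stem in wanted)


def describe_import_origin(module_name):
    """Return the file Python would load for `module_name`, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin
```

If `describe_import_origin("pdfrw")` points at a file in your own project folder rather than at site-packages, your script is shadowing the installed module, and renaming the script should fix the import.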
Reply
#2
From the command line: python -m pip install pdfreader
PyPI: https://pypi.org/project/pdfreader/
GitHub: https://github.com/maxpmaxp/pdfreader
Reply
#3
There is a great article on the Pythonology website about the best Python libraries for this purpose:
https://pythonology.eu/what-is-the-best-...f-library/

The article comes with a tutorial on how to use those libraries to create or edit PDF files.
Best of luck!
Reply
#4
You could try reportlab for building from scratch.

You will find this helpful: reportlab-userguide.pdf. It covers a lot of reportlab, though not everything.

reportlab lets you control every aspect of your PDF.

Just a small example for creating a PDF:

# Read the reportlab docs: long, but worth it to control every aspect when creating PDFs.
import os

from reportlab.lib.pagesizes import A4  # page size can be standard or custom
from reportlab.pdfbase import pdfmetrics  # font registry
from reportlab.pdfbase.ttfonts import TTFont  # loads a TrueType font from a file
from reportlab.pdfgen import canvas  # the drawing surface for the pages

# The Chinese fonts to use; adjust the paths to fonts on your own system.
fontpath = '/home/pedro/.local/share/fonts/'
ttfFile = os.path.join(fontpath, '萌萌哒情根深种-中文.ttf')
ttfFile2 = os.path.join(fontpath, 'DroidSansFallbackFull.ttf')
pdfmetrics.registerFont(TTFont("Chinese", ttfFile))
pdfmetrics.registerFont(TTFont("Droid", ttfFile2))


def create_pdf():
    pdf_file = '/home/pedro/pdfs/multipage.pdf'
    can = canvas.Canvas(pdf_file, pagesize=A4)
    can.setTitle("My PDF")  # document title; many viewers show it in the window title
    # Coordinates are in points (1/72 inch), measured from the bottom-left corner.
    can.setFont('Chinese', 20)
    can.drawString(20, 800, "First Page 第一页")
    can.showPage()  # ends the current page; later drawing goes on a new page
    # Font settings do not carry over to the next page, so set the font again.
    can.setFont('Times-Roman', 20)
    can.drawString(20, 800, "Second Page")
    can.setFont('Chinese', 20)
    can.drawString(40, 700, "第二页")
    can.showPage()
    can.setFont('Times-Roman', 20)
    can.drawString(20, 700, "Third Page")
    can.setFont('Droid', 20)
    can.drawString(40, 600, "第三页")
    can.showPage()
    can.save()  # writes the file to disk


create_pdf()

To extract some pages from a PDF into a smaller PDF, I use PyPDF2. You can also make a PDF with PyPDF2.
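
A rough sketch of the page extraction mentioned above, assuming PyPDF2 2.x or later (where the classes are called `PdfReader` and `PdfWriter`; older releases used `PdfFileReader` and `PdfFileWriter`). The function names and the "1-3,5" page-spec format are my own invention for illustration, not something PyPDF2 provides.

```python
def parse_page_ranges(spec):
    """Turn a 1-based spec such as "1-3,5" into zero-based page indices."""
    indices = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            indices.extend(range(int(first) - 1, int(last)))
        else:
            indices.append(int(part) - 1)
    return indices


def extract_pages(src_path, dst_path, spec):
    """Copy the pages named by `spec` from src_path into a new PDF at dst_path."""
    # Imported here so parse_page_ranges works even without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter

    reader = PdfReader(src_path)
    writer = PdfWriter()
    for index in parse_page_ranges(spec):
        writer.add_page(reader.pages[index])
    with open(dst_path, "wb") as out:
        writer.write(out)
```

For example, `extract_pages("multipage.pdf", "subset.pdf", "1,3")` would write the first and third pages to a new file; both file names are placeholders.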

pdfminer is a good module for extracting text from PDFs when other PDF modules can't get it.
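
A minimal sketch of text extraction, assuming the maintained pdfminer.six package (`python -m pip install pdfminer.six`), which provides `pdfminer.high_level.extract_text`. The wrapper names `pdf_to_text` and `non_empty_lines` are hypothetical helpers added here for illustration.

```python
def pdf_to_text(pdf_path):
    """Return all text pdfminer can extract from the PDF at pdf_path."""
    # Imported here so non_empty_lines works even without pdfminer.six installed.
    from pdfminer.high_level import extract_text

    return extract_text(pdf_path)


def non_empty_lines(text):
    """Strip each line and drop the blank ones that extraction often leaves."""
    return [line.strip() for line in text.splitlines() if line.strip()]
```

Something like `non_empty_lines(pdf_to_text("multipage.pdf"))` then gives a list of text lines to search or print; the file name is a placeholder. Scanned PDFs that contain only images have no text layer, so they would need OCR instead.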
Reply

