Python Forum
Recommended way to read/create PDF file?
#1
Hello,

I need to either read/fill/sign an existing PDF file, or build one from scratch.

Using this code, the existing PDF file uses fonts that aren't listed in c:\windows\fonts, so I assume they are embedded in the PDF.

I know nothing about working with PDFs, and would like to have your advice about how to proceed.

Should I somehow fill the existing file with my own text + signature PNG, and merge those into a new PDF, or should I create a new PDF from scratch using either the embedded fonts that I'll export somehow or use a close-enough font?

FWIW, I'm using Python 3.7.0 on 32-bit Windows 7.

Thank you.

PS: Incidentally,
#python -m pip install pdfrw
from pdfrw import PdfReader
→ ImportError: cannot import name 'PdfReader' from 'pdfrw'
--
Edit: Found the error: I didn't know that a Python script can't have the same name as the module it imports (pdfrw.py, here).
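
For anyone hitting the same ImportError: one quick way to check for this kind of name shadowing is to ask Python which file it would load for the module. A minimal sketch using only the standard library; module_origin is a hypothetical helper name, not part of pdfrw:

```python
import importlib.util


def module_origin(name):
    """Return the file Python would load for `name`, or None if not found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


# If this prints a path inside your own project folder (e.g. ...\pdfrw.py)
# instead of site-packages, a local script is shadowing the installed package.
print(module_origin("pdfrw"))
```

Renaming the local script (and deleting any stale pdfrw.pyc or __pycache__ entry next to it) should make the import resolve to the installed package again.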
#2
From the command line: python -m pip install pdfreader
pypi: https://pypi.org/project/pdfreader/
github: https://github.com/maxpmaxp/pdfreader
#3
There is a great article on the Pythonology website about the best Python libraries for this purpose:
https://pythonology.eu/what-is-the-best-...f-library/

The article comes with a tutorial on how to use those libraries to create or edit PDF files.
Best of luck.
#4
You could try reportlab for building from scratch.

You will find this helpful: reportlab-userguide.pdf. It covers a lot about reportlab, though not everything.

reportlab allows you to control every aspect of your PDF.

Just a small example for creating a PDF:

# read the reportlab docs
# long but worth it to control every aspect when creating PDFs
from reportlab.pdfgen import canvas  # the canvas holds one or more pages
from reportlab.lib.pagesizes import A4  # page size can be standard or custom
from reportlab.pdfbase.ttfonts import TTFont  # wraps a TrueType font file
from reportlab.pdfbase import pdfmetrics  # font registration
from reportlab.lib.units import mm  # optional unit helper; the default unit is the point (1/72")
from reportlab.lib.colors import pink, green, brown, white, black  # not used in this small example
import os

# the Chinese fonts to use
fontpath = '/home/pedro/.local/share/fonts/'
ttfFile = os.path.join(fontpath, '萌萌哒情根深种-中文.ttf')
ttfFile2 = os.path.join(fontpath, 'DroidSansFallbackFull.ttf')
# register each font under a short name that setFont() can refer to
pdfmetrics.registerFont(TTFont("Chinese", ttfFile))
pdfmetrics.registerFont(TTFont("Droid", ttfFile2))


def create_pdf():
    pdf_file = '/home/pedro/pdfs/multipage.pdf' 
    can = canvas.Canvas(pdf_file, pagesize=A4)
    can.setTitle("My PDF") # shown in PDF window
    can.setFont('Chinese', 20)
    can.drawString(20, 800, "First Page 第一页")
    can.showPage() # ends the page and creates a new page
    can.setFont('Times-Roman', 20)
    can.drawString(20, 800, "Second Page")
    can.setFont('Chinese', 20)
    can.drawString(40, 700, "第二页")
    can.showPage()
    can.setFont('Times-Roman', 20)
    can.drawString(20, 700, "Third Page")
    can.setFont('Droid', 20)
    can.drawString(40, 600, "第三页")
    can.showPage() 
    can.save()
 
create_pdf()

For extracting some pages from a PDF into a smaller PDF I use PyPDF2. You can also create a PDF with PyPDF2.

pdfminer (pdfminer.six on Python 3) is a good module for getting text out of PDFs that other PDF modules can't read.
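
A minimal sketch of the page extraction mentioned above, assuming PyPDF2 2.x or later (older releases use PdfFileReader/PdfFileWriter instead). parse_pages and extract_pages are hypothetical helper names, and the file names in the example call are placeholders:

```python
def parse_pages(spec):
    """Turn a 1-based spec such as "1-3,5" into 0-based page indexes."""
    indexes = []
    for part in spec.split(","):
        first, _, last = part.strip().partition("-")
        start = int(first)
        end = int(last) if last else start
        indexes.extend(range(start - 1, end))
    return indexes


def extract_pages(src_path, dst_path, spec):
    """Copy the pages named in `spec` from src_path into a new PDF at dst_path."""
    # Imported here so parse_pages() can be used without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter  # assumes PyPDF2 2.x or later

    reader = PdfReader(src_path)
    writer = PdfWriter()
    for index in parse_pages(spec):
        writer.add_page(reader.pages[index])
    with open(dst_path, "wb") as out_file:
        writer.write(out_file)


# Example (hypothetical file names):
# extract_pages("big.pdf", "small.pdf", "1-3,5")
```

The page spec is 1-based because that is how PDF viewers number pages; reader.pages is indexed from 0, hence the conversion.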