Python Forum
Split PDF at regex value
#9
OK. I will try it.

(Jun-13-2023, 07:14 PM)deanhystad Wrote: What "gives you this stuff"?

Those are not error messages. Is your program printing them, or are they messages from PyPDF2? When are they printed? Are they output when you try to extract text from a page?

I would try something like this to diagnose.
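import re
from PyPDF2 import PdfReader

# Matches e.g. "Visit: 06/13/2023" and captures the date part.
date_regex = re.compile(r"Visit: (\d{2}/\d{2}/\d{4})")


def split_pdf_by_date(pdf_path):
    # Diagnostic only: nothing is split yet. Print the regex result for
    # each page to see where the unexpected output appears.
    pdf = PdfReader(pdf_path)
    for pagenum, page in enumerate(pdf.pages, start=1):
        print("Page", pagenum)
        # Guard against a page that yields no text at all.
        text = page.extract_text() or ""
        print("Search")
        # Prints a Match object if the page has a visit date, else None.
        print(date_regex.search(text), "\n")


split_pdf_by_date("Test.pdf")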
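
Once the diagnostic shows which pages match, the next step is deciding where to split. Here is a rough sketch of just that grouping step, kept separate from PyPDF2 so it can be checked on plain strings. The function name group_pages_by_date and the page_texts list are made up for illustration, and it assumes a new section starts whenever a page shows a different "Visit:" date than the previous section.

```python
import re

date_regex = re.compile(r"Visit: (\d{2}/\d{2}/\d{4})")


def group_pages_by_date(page_texts):
    """Group consecutive page indices under the most recent date found.

    page_texts is a hypothetical list of strings, one per page.
    Pages before the first match are grouped under None.
    """
    groups = []
    for index, text in enumerate(page_texts):
        match = date_regex.search(text or "")
        if match and (not groups or match.group(1) != groups[-1][0]):
            # A date different from the current group starts a new group.
            groups.append((match.group(1), [index]))
        elif groups:
            # No date, or the same date again: page stays in the current group.
            groups[-1][1].append(index)
        else:
            # Leading pages with no date at all.
            groups.append((None, [index]))
    return groups


sample = ["cover", "Visit: 01/02/2023 a", "notes", "Visit: 03/04/2023"]
print(group_pages_by_date(sample))
```

Each group of page indices could then be written to its own file, for example by adding pdf.pages[i] to a PyPDF2 PdfWriter with add_page and calling write. Whether that fits depends on how the real PDF lays out its dates.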


