Python Forum
Search text in PDF and output its page number.
#21
Can pdfplumber search part of a word then print results with the whole word?

example:

search word: Page (

Output:
search word: Page (1) found on page 1
search word: Page (2) found on page 2
search word: Page (3) found on page 3
...
#22
(Jan-21-2022, 03:51 AM)atomxkai Wrote: Can pdfplumber search part of a word then print results with the whole word?
That part is up to you, since pdfplumber just returns plain text.
For this task you can use a regex.
E.g. the pattern r"\bpage\s\d+\b" will find page 1, page 2 or page 50.
It matches the literal text page, then \s (a whitespace character), then \d (a digit), with + repeating the preceding \d one or more times; \b marks a word boundary. Note that the match is case-sensitive, so it finds page 1 but not Page 1 unless you pass re.IGNORECASE.
Example.
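import re

import pdfplumber

pdf_file = "sample.pdf"
# Case-sensitive: matches "page 1", "page 50", ...
pattern = re.compile(r"\bpage\s\d+\b")
with pdfplumber.open(pdf_file) as pdf:
    for page_nr, pg in enumerate(pdf.pages, 1):
        # Guard against pages with no extractable text
        content = pg.extract_text() or ""
        for match in pattern.finditer(content):
            # match.start() is the offset of this match, even if the same text repeats on the page
            print(match.group(), page_nr, match.start())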
Output:
page 2 1 568
page 1 2 39
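For the original question (search for part of a word such as "Page (" and print the whole word), one possible approach is to build the pattern from the search text. This is only a sketch: find_whole_words and sample_pages are made-up names for illustration, and the list of strings stands in for what you would collect from pg.extract_text() on each page. re.escape makes the "(" literal, and \S* on both sides extends the hit to the surrounding non-whitespace characters.

```python
import re


def find_whole_words(partial, page_texts, ignore_case=True):
    """Yield (whole_word, page_number, offset) for every hit of `partial`.

    Hypothetical helper: `page_texts` is a list with one string per page.
    """
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape treats characters like "(" literally; \S* on each side
    # widens the match to the neighbouring non-whitespace characters.
    pattern = re.compile(r"\S*" + re.escape(partial) + r"\S*", flags)
    for page_nr, text in enumerate(page_texts, 1):
        # A page without extractable text may come back as None or "".
        for match in pattern.finditer(text or ""):
            yield match.group(), page_nr, match.start()


# Stand-in for: [pg.extract_text() for pg in pdf.pages]
sample_pages = ["Intro text Page (1) more", "Other text Page (2)", None]
results = list(find_whole_words("Page (", sample_pages))
for word, page_nr, offset in results:
    print(f"search word: {word} found on page {page_nr}")
```

With ignore_case=True a search for "page (" would also hit "Page (1)"; set it to False if you need an exact-case match.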

