Python Forum
PDF reader
#1
Hello,
I'm doing a master's thesis in finance, so I decided to learn Python to make things simpler.
I have learned the basics, but my question is whether Python can read a folder full of PDF files and search them for specific keywords.
This is probably a beginner question, but any help regarding the subject would be extremely valuable. Thanks in advance.
#2
On Ubuntu Linux, there is a command-line tool named pdfgrep that may work.
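If you would rather drive it from Python, something like the following might do as a starting point. It is only a sketch: it assumes pdfgrep is installed and on your PATH, and the function names and the `thesis_pdfs` folder are made up for the example. The flags are `-r` (recurse into the folder), `-i` (ignore case) and `-n` (print page numbers).

```python
import shutil
import subprocess


def build_pdfgrep_command(keyword, folder):
    """Return the argument list for a recursive, case-insensitive pdfgrep search."""
    # -r: recurse into the folder, -i: ignore case, -n: show page numbers
    return ["pdfgrep", "-r", "-i", "-n", keyword, folder]


def run_pdfgrep(keyword, folder):
    """Run pdfgrep and return its text output, or None if it is not installed."""
    if shutil.which("pdfgrep") is None:
        return None
    result = subprocess.run(
        build_pdfgrep_command(keyword, folder),
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # "dividend" and "thesis_pdfs" are placeholder values for illustration.
    print(run_pdfgrep("dividend", "thesis_pdfs"))
```

Running the same search directly in a terminal works just as well; the wrapper is only useful if you want to process the matches further in Python.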
#3
There's PyMuPDF:
PyPI: https://pypi.python.org/pypi/PyMuPDF/1.12.4
GitHub: https://github.com/rk700/PyMuPDF
Documentation: https://pymupdf.readthedocs.io/en/latest/

and PyPDF2:
PyPI: https://pypi.python.org/pypi/PyPDF2/1.26.0
Documentation: http://pythonhosted.org/PyPDF2/

I have used both, and they complement each other.
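For the original question (a folder of PDFs, a list of keywords), a rough sketch with PyMuPDF could look like this. Treat it as an illustration rather than a finished tool: the function names are my own, it assumes a recent PyMuPDF release where the page method is called `get_text` (older releases spelled it `getText`), and it only finds text that is actually stored in the PDF, so scanned pages without a text layer will come back empty.

```python
from pathlib import Path


def find_keywords(text, keywords):
    """Return the keywords that occur in text, ignoring case."""
    lowered = text.lower()
    return [kw for kw in keywords if kw.lower() in lowered]


def extract_pages(pdf_path):
    """Yield (page_number, text) for each page of a PDF, using PyMuPDF."""
    import fitz  # PyMuPDF; imported here so the matching code runs without it

    with fitz.open(str(pdf_path)) as doc:
        for number, page in enumerate(doc, start=1):
            yield number, page.get_text()


def search_folder(folder, keywords, extractor=extract_pages):
    """Map each PDF file name to a list of (page_number, matched_keywords)."""
    results = {}
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        hits = []
        for number, text in extractor(pdf_path):
            matched = find_keywords(text, keywords)
            if matched:
                hits.append((number, matched))
        if hits:
            results[pdf_path.name] = hits
    return results
```

Calling `search_folder("thesis_pdfs", ["dividend", "leverage"])` (the folder name is a placeholder) would return a dictionary keyed by file name. The `extractor` argument is there so the page-reading step can be swapped out, for example for a PyPDF2-based reader.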