Python Forum

PDF reader
Hello,
I'm writing a master's thesis in Finance, so I decided to learn Python to make things simpler.
I have learned the basics, but my question is whether Python can read a folder full of PDF files and search them for specific keywords.
This is probably a beginner question, but any help on the subject would be extremely valuable. Thanks in advance.
On Ubuntu Linux, there is a command-line tool named pdfgrep that may work.
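If you want to drive pdfgrep from Python rather than from the shell, something like the sketch below might do. It assumes pdfgrep is installed and on your PATH, and it uses the -r (recursive), -i (ignore case) and -n (page number) options. The function names are just made up for this example.

```python
import shutil
import subprocess


def build_pdfgrep_command(keyword, folder):
    """Build a pdfgrep call: recursive, case-insensitive, with page numbers."""
    return ["pdfgrep", "-r", "-i", "-n", keyword, str(folder)]


def run_pdfgrep(keyword, folder):
    """Return pdfgrep's output lines, or None if pdfgrep is not installed."""
    if shutil.which("pdfgrep") is None:
        return None
    result = subprocess.run(
        build_pdfgrep_command(keyword, folder),
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit status also covers "no match found"
    )
    return result.stdout.splitlines()
```

Calling run_pdfgrep("liquidity", "papers") would then give you one line per match, where "papers" stands in for whatever your folder is called.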
There's PyMuPDF:
PyPI: https://pypi.python.org/pypi/PyMuPDF/1.12.4
GitHub: https://github.com/rk700/PyMuPDF
Documentation: https://pymupdf.readthedocs.io/en/latest/

and PyPDF2:
PyPI: https://pypi.python.org/pypi/PyPDF2/1.26.0
Documentation: http://pythonhosted.org/PyPDF2/

I have used both, and they complement each other.
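For the original question (a folder of PDFs and a list of keywords), a minimal sketch with PyMuPDF could look like this. It assumes a recent PyMuPDF release, where the module is imported as fitz and pages have a get_text() method (older releases, such as the 1.12.4 linked above, called it getText()). The names search_folder and extract_text_pymupdf are my own, not part of either library, and the match is a plain case-insensitive substring test, so scanned PDFs without a text layer will not be found.

```python
from pathlib import Path


def extract_text_pymupdf(pdf_path):
    """Return a list with the text of each page of one PDF, using PyMuPDF."""
    import fitz  # PyMuPDF; imported here so the search logic works without it

    doc = fitz.open(str(pdf_path))
    try:
        # Assumes a recent PyMuPDF; older versions named this getText().
        return [page.get_text() for page in doc]
    finally:
        doc.close()


def search_folder(folder, keywords, extract_pages=extract_text_pymupdf):
    """Map each keyword to a list of (file name, page number) hits."""
    hits = {kw: [] for kw in keywords}
    # Only looks at *.pdf files directly inside the folder, in name order.
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        for page_no, text in enumerate(extract_pages(pdf_path), start=1):
            lowered = text.lower()
            for kw in keywords:
                if kw.lower() in lowered:
                    hits[kw].append((pdf_path.name, page_no))
    return hits
```

You would call it as search_folder("papers", ["liquidity", "volatility"]), with your own folder and keywords. The extract_pages argument is there so you can swap in a PyPDF2-based reader later without touching the search loop.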