Python Forum
How to compare two PDFs for differences

How to compare two PDFs for differences - Normanie - Mar-25-2020

Hello,

I receive two PDFs daily from two different sources. Theoretically they're supposed to have the exact same numbers - but I would like to create an automated report which confirms this and notifies me of any issues.

Unfortunately, certain 'titles' within the reports differ ever so slightly, which makes matching the rows by title difficult.

For example, they may have the same balances, but one might be called "duck USD" while the other is just called "duck" (example names).

Bearing in mind I'm relatively new when it comes to Python, what road could I go down in order to create this automation?




To give an example of the style of layout:

Report 1
Dog 50 2,000 5,000
Cat 80 5,000 10,000

Report 2
Dog USD 50 2,000 5,000
Cat EUR 80 5,000 10,000
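
One possible road, sketched under some assumptions: the text has already been extracted from both PDFs, each row is a label followed by numbers, and the only difference between labels is a trailing three-letter currency code such as USD or EUR. The names parse_report and compare_reports are made up for this illustration, and only the data rows are passed in (a heading like "Report 1" would otherwise be read as a row).

```python
import re
from decimal import Decimal

# Assumption: numbers look like 50 or 2,000 or 1,234.56
NUMBER = re.compile(r"^-?\d[\d,]*(\.\d+)?$")
# Assumption: a trailing three-letter uppercase token is a currency code
CURRENCY = re.compile(r"^[A-Z]{3}$")


def parse_report(text):
    """Map a normalized row label to its tuple of numbers."""
    rows = {}
    for line in text.splitlines():
        tokens = line.split()
        # Everything before the first numeric token is treated as the label.
        first_num = next(
            (i for i, t in enumerate(tokens) if NUMBER.match(t)), len(tokens)
        )
        label, values = tokens[:first_num], tokens[first_num:]
        if not label or not values:
            continue  # blank line, or a line without both a label and numbers
        if len(label) > 1 and CURRENCY.match(label[-1]):
            label = label[:-1]  # "Dog USD" -> "Dog"
        key = " ".join(label).lower()
        rows[key] = tuple(Decimal(v.replace(",", "")) for v in values)
    return rows


def compare_reports(text_a, text_b):
    """Return a list of human-readable issues; an empty list means a match."""
    a, b = parse_report(text_a), parse_report(text_b)
    issues = []
    for key in sorted(a.keys() | b.keys()):
        if key not in a:
            issues.append(f"{key}: missing from report 1")
        elif key not in b:
            issues.append(f"{key}: missing from report 2")
        elif a[key] != b[key]:
            left = ", ".join(map(str, a[key]))
            right = ", ".join(map(str, b[key]))
            issues.append(f"{key}: {left} != {right}")
    return issues


report_1 = "Dog 50 2,000 5,000\nCat 80 5,000 10,000"
report_2 = "Dog USD 50 2,000 5,000\nCat EUR 80 5,000 10,000"
for issue in compare_reports(report_1, report_2):
    print(issue)
```

With the two sample reports above nothing is printed, because the rows match once the currency codes are dropped. If the real titles differ in other ways, the normalization step in parse_report is the part to adapt.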


RE: How to compare two PDFs for differences - donmed777 - Jul-28-2020

Can you explain more what you want to do?


RE: How to compare two PDFs for differences - millpond - Jul-30-2020

This package has a PDF-to-text function:
https://pypi.org/project/pdf/
I would convert both PDFs to text files and run a system file compare (until I learn to use the Python functions for that!)
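
For the "Python functions for that" part, the standard library's difflib module can do the file compare. A minimal sketch, assuming both PDFs have already been converted to text by whatever tool you use; diff_text and the file names are hypothetical names for this example only.

```python
import difflib


def diff_text(text_a, text_b, name_a="report1.txt", name_b="report2.txt"):
    """Return the unified-diff lines between two extracted text dumps."""
    return list(
        difflib.unified_diff(
            text_a.splitlines(),
            text_b.splitlines(),
            fromfile=name_a,
            tofile=name_b,
            lineterm="",
        )
    )


# Sample text standing in for the converted PDFs.
old = "Dog 50 2,000 5,000\nCat 80 5,000 10,000"
new = "Dog 50 2,000 5,000\nCat 80 5,000 10,001"
for line in diff_text(old, new):
    print(line)
```

An empty result means the two texts are identical line for line. Note that this is a plain line comparison, so rows whose titles differ ("Dog" vs "Dog USD") will show up as differences even when the numbers agree.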

Better is this:
https://pypi.org/project/pdf-diff3/
It will display the diff as a PNG image.