Python Forum
Image Processing in Python
#1
Hello all

I wanted to get the community's advice on a Python project I am considering embarking on.

I have a PDF file of approximately 50 pages, each page containing a graph/plot.

Along the top and bottom of each page are text and various values, which are used to isolate a specific page and portion of the plot.

I would like to import the PDF, say as an image, and pass some text and some numbers to a function. Python would then loop through each page of the PDF until it finds that text and those numbers.

Once it finds them, some text and a circle would be placed at the location where the text/numbers were found.

Is this something that is even possible?

I would love to hear your thoughts.

Thanks
#2
Here's something to whet your appetite:

PDFs are a pain in general.

One reason has to do with pages that are images taken from photographs, which may or may not include text.
The text on these types of pages is notoriously difficult to extract, but it can (with extreme effort) be done with OCR (optical character recognition) software. There are free packages that do this, but they are mediocre at best. Even most of the commercial products struggle with this type of content.

Pure converted text is much easier to deal with (caveat: may require extensive tweaking to get text positions on some documents), especially if the text is arranged in tables.

There are several packages available to deal with this type of conversion.
I have used almost all of them.

The most common are:
The graphics portion is actually the easier (but not easy) part.

I would consider wxPython or PyQt5 as the graphics package.
wxPython is 100% free, and the latest (Phoenix) version is very robust.

Qt5 is arguably the most advanced graphics package for Python, but royalty and other fees may be involved with some commercial uses.
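
Going back to the original question, here is a rough sketch of the search step only: loop over the pages, find the page that carries a given label, then work out where a circle should go around a given value. It assumes you have already extracted each page's words together with their bounding boxes using whichever PDF text package you settle on. The names Word, Marker and find_marker are made up for this illustration and are not part of any library, and the actual drawing is left to the graphics package.

```python
from dataclasses import dataclass
from math import hypot
from typing import Iterable, Optional


@dataclass(frozen=True)
class Word:
    """One extracted word and its bounding box (hypothetical structure)."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


@dataclass(frozen=True)
class Marker:
    """Where to draw the circle: zero-based page index, centre, radius."""
    page: int
    cx: float
    cy: float
    radius: float


def find_marker(pages: Iterable[Iterable[Word]], label: str, value: str,
                padding: float = 4.0) -> Optional[Marker]:
    """Return a circle around the first `value` on the first page
    that also contains `label`, or None if no page has both."""
    for index, page_words in enumerate(pages):
        words = list(page_words)
        # Skip pages that do not carry the identifying label at all.
        if not any(w.text == label for w in words):
            continue
        for w in words:
            if w.text == value:
                # Centre of the word's box; the radius is half the box
                # diagonal plus padding, so the circle encloses the word.
                cx = (w.x0 + w.x1) / 2
                cy = (w.y0 + w.y1) / 2
                radius = hypot(w.x1 - w.x0, w.y1 - w.y0) / 2 + padding
                return Marker(index, cx, cy, radius)
    return None


# Made-up example data: two pages, each a list of extracted words.
example_pages = [
    [Word("A1", 55, 10, 70, 20), Word("12.5", 100, 200, 130, 210)],
    [Word("B2", 55, 10, 70, 20), Word("12.5", 300, 400, 330, 410)],
]
print(find_marker(example_pages, "B2", "12.5"))
```

Exact string matching is used here to keep the sketch short. Real extracted text often needs normalising first (whitespace, number formatting), and OCR output even more so.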