Python Forum
OCR again
Hi,
I am refining (that is, making more efficient) my programs for
OCR-ing prayer cards (PCs). I do not know what PCs look like in your countries,
but imagine a small, folded document, so it has 4 printable sides.
Only one of those sides has the content we need to OCR (all sides are scanned).
So I crop the document into 4 pieces and examine them one by one
until I find where the money is. That works, but I could optimize.
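A minimal sketch of the cropping step, assuming the four pages sit in a 2x2 grid on the scan (the post does not say how the scans are laid out, and `quadrant_boxes` is a made-up helper name). It only computes the crop rectangles, so it runs without any imaging library:

```python
def quadrant_boxes(width, height):
    """Return four (left, upper, right, lower) boxes for a 2x2 split.

    Assumption: the scan is a 2x2 grid of equally sized pages.
    """
    mid_x, mid_y = width // 2, height // 2
    return [
        (0, 0, mid_x, mid_y),           # top-left page
        (mid_x, 0, width, mid_y),       # top-right page
        (0, mid_y, mid_x, height),      # bottom-left page
        (mid_x, mid_y, width, height),  # bottom-right page
    ]
```

With Pillow, each box can be passed to `Image.crop`, for example `pages = [scan.crop(box) for box in quadrant_boxes(*scan.size)]`, where `scan` is an already opened image.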
Questions (a "page" is 1/4 of the PC):
1) Sometimes a page is blank, with nothing on it.
2) Sometimes a page is completely covered with an image (not half of it, completely).
OCR spends time trying to find specific words on those pages; when it finds none, it goes on to the next page.
How could I determine that a page is blank, or fully covered with colors or grey values,
so I can skip it right away? (Preferably not by examining the RGB value of each pixel.)
Any ideas?
Paul
It is more important to do the right thing, than to do the thing right.(P.Drucker)
Better is the enemy of good. (Montesquieu) = French version for 'kiss'.
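
One possible approach, sketched under assumptions: classify each page from its grayscale histogram instead of looping over pixels in Python. With Pillow, `page.convert("L").histogram()` returns 256 counts. The function name `classify_page` and all three thresholds below are illustrative guesses that would need tuning on real scans. They are not a tested recipe.

```python
def classify_page(hist, white_level=200, blank_ratio=0.995, covered_ratio=0.20):
    """Classify a page from its 256-bin grayscale histogram.

    Returns "blank", "covered" or "candidate".
    Assumption: the paper background is brighter than white_level.
    """
    total = sum(hist)
    if total == 0:
        raise ValueError("empty histogram")
    # Share of pixels at least as bright as white_level (paper background).
    bright = sum(hist[white_level:]) / total
    if bright >= blank_ratio:
        return "blank"      # almost nothing but paper
    if bright <= covered_ratio:
        return "covered"    # hardly any paper visible: picture or solid tone
    return "candidate"      # paper plus some ink: worth sending to OCR
```

Only pages that come back as "candidate" would go to OCR. A page that is uniformly grey rather than white lands in "covered" with these defaults, which still means it is skipped.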