Python Forum

a couple of big project i am making plans for - Skaperen - Nov-30-2021

actually there are 2 closely related projects.

number 1 is an optical character finder mechanism. it would scan the image of a page to detect where text is located on the page and what font size it is (in units relative to the page). closely positioned letters can be grouped into words, sentences, and larger chunks. these chunks of text would be saved with properties like size and position, ready for searching across many pages. the text may be laid out in structures that are not easy to turn into a plain ASCII page, but this would still make it possible to search for text in the old historical books i have big image collections of.
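a minimal sketch of the text-finding step, assuming the page has already been reduced to rows of 0 (white) and 1 (ink). it uses simple row and column ink counts (a projection profile) to find bands of text and reports each band's position and size as fractions of the page. the names `ink_runs` and `find_text_lines` are made up for illustration, and real scans would need noise filtering that is left out here.

```python
def ink_runs(profile, min_ink=1):
    """Return (start, end) pairs for runs where profile >= min_ink; end is exclusive."""
    runs, start = [], None
    for i, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = i                      # a run of inked rows/columns begins
        elif ink < min_ink and start is not None:
            runs.append((start, i))        # the run ended just before index i
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def find_text_lines(page):
    """page: list of equal-length rows of 0/1 pixels.

    Returns one dict per band of inked rows, with position and size
    expressed as fractions of the page height and width.
    """
    height, width = len(page), len(page[0])
    lines = []
    for top, bottom in ink_runs([sum(row) for row in page]):
        band = page[top:bottom]
        # column ink counts inside this band give its horizontal extent
        col_ink = [sum(row[x] for row in band) for x in range(width)]
        cols = ink_runs(col_ink)
        left, right = cols[0][0], cols[-1][1]
        lines.append({
            "top": top / height,
            "height": (bottom - top) / height,
            "left": left / width,
            "width": (right - left) / width,
        })
    return lines
```

the band height is only a rough stand-in for font size, and the same `ink_runs` helper could be applied to the column profile of one band to split it into words by looking at the gaps.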

number 2 is to reproduce the book pages with clean text fonts, scaled to make a replacement image that is easier to read, along with the images that are not text. a lot of these books were published before 1900, or maybe not even published at all, like old church documents from as early as 1600. the text scanning does not need to be perfect. there are places where i don't see any possibility of recognizing the text. many of the images are rotated slightly and this will be corrected so all pages are lined up.
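one possible way to estimate the slight rotation, sketched under assumptions: the ink pixels are given as (x, y) coordinates, the skew is within a few degrees, and a shear (y - x * tan(angle)) is a good enough approximation of a small rotation. for each candidate angle it counts how many pixels land on each row and keeps the angle where the rows are most sharply peaked, which is the usual projection profile idea. `profile_score` and `estimate_skew` are hypothetical names, not from any library.

```python
import math


def profile_score(points, angle_deg):
    """Sum of squared row counts after shearing the points by angle_deg.

    The score is highest when text baselines collapse onto few rows.
    """
    t = math.tan(math.radians(angle_deg))
    counts = {}
    for x, y in points:
        row = round(y - x * t)             # row this pixel would fall on if deskewed
        counts[row] = counts.get(row, 0) + 1
    return sum(c * c for c in counts.values())


def estimate_skew(points, max_angle=5.0, step=0.25):
    """Try angles from -max_angle to +max_angle degrees and return the best one.

    In image coordinates (y grows downward) a positive result means the
    baselines drop toward the right side of the page.
    """
    steps = int(round(2 * max_angle / step))
    candidates = [-max_angle + i * step for i in range(steps + 1)]
    return max(candidates, key=lambda a: profile_score(points, a))
```

the angle resolution that can actually be distinguished depends on the page width in pixels, so on narrow images several neighboring candidates can tie. the page would then be rotated back by the estimated angle with whatever imaging library is chosen.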

i need to find any tools intended for image scanning and matching as well as optical character recognition implementations. i wonder what Python programmers do when working heavily at the pixel level, especially with gradient color or simple black and white bits like most of these. do they do all this in Python or drop down to using C? then i'll need to decide what i want to do with this.
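on the question of staying in Python versus dropping to C, one middle path is to let built-in operations that are implemented in C do the per-pixel loop. the sketch below assumes an 8-bit grayscale image held as a flat `bytes` object and turns it into pure black and white with `bytes.translate` and a 256-entry lookup table. the function names and the default threshold of 128 are assumptions for illustration only.

```python
def make_threshold_table(threshold):
    """Lookup table mapping each gray level 0..255 to 0 (black) or 255 (white)."""
    return bytes(255 if level >= threshold else 0 for level in range(256))


def to_black_and_white(gray, threshold=128):
    """gray: bytes with one 8-bit gray value per pixel. Returns bytes of 0/255."""
    # translate() applies the table to every byte without a Python-level loop
    return gray.translate(make_threshold_table(threshold))


def count_ink(black_and_white):
    """Number of black pixels in a thresholded image."""
    return black_and_white.count(0)
```

the only Python-level loop here builds the 256-entry table, so the cost per pixel stays small even for large scans. array libraries such as NumPy work on the same principle of pushing the inner loop into compiled code.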

another unrelated project is to build a GUI application to draw Mandelbrot set and Julia set images with an array of calculation engines to spread the work over many cores, systems, and/or cloud instances, even with a mix of architectures. the actual calculation engine will be done in C, but the workload managers that pretend to be an engine and spread the load among many other engines will be done in Python, as will the display GUI application. but i do wonder if anyone has gotten some high-speed Mandelbrot calculation to run in Python. probably one of the first things i need to do is design the network protocol this will use. i know this kind of thing has been done, as i saw a physical demo of 25 PCs doing it way back in 1992 (when everything was 1 core, a lot slower, most likely not in Python, and there was no cloud). networking was barely around then (and social media definitely was not).
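a rough sketch of how the manager and engine split could look, with everything here being an assumption rather than a design: the `REQUEST` wire format (four doubles and three unsigned 32-bit integers in network byte order) is invented for illustration, `engine` is a pure-Python stand-in for the planned C engine, and a thread pool stands in for real network connections to other engines.

```python
import struct
from concurrent.futures import ThreadPoolExecutor

# hypothetical request layout: x0, y0, dx, dy as doubles; width, height, max_iter as uint32
REQUEST = struct.Struct("!ddddIII")


def escape_count(c, max_iter):
    """Iterations of z = z*z + c before |z| exceeds 2, or max_iter if it never does."""
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter


def engine(request):
    """Stand-in for a remote engine: decode one request, return rows of counts."""
    x0, y0, dx, dy, width, height, max_iter = REQUEST.unpack(request)
    return [[escape_count(complex(x0 + col * dx, y0 + row * dy), max_iter)
             for col in range(width)]
            for row in range(height)]


def manager(x0, y0, dx, dy, width, height, max_iter, bands=4):
    """Looks like an engine to its caller, but splits the job into horizontal bands."""
    rows_per_band = -(-height // bands)    # ceiling division
    requests = []
    for start in range(0, height, rows_per_band):
        rows = min(rows_per_band, height - start)
        requests.append(
            REQUEST.pack(x0, y0 + start * dy, dx, dy, width, rows, max_iter))
    with ThreadPoolExecutor(max_workers=bands) as pool:
        parts = pool.map(engine, requests)  # map() yields results in request order
        return [row for part in parts for row in part]
```

because a manager takes the same parameters an engine does, managers could be stacked, with one manager handing bands to other managers on other systems. threads in CPython will not speed up this pure-Python calculation, so the pool only illustrates the dispatch and reassembly, not the performance.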