Python Forum

Full Version: MultiThreading
Hi,
Not a showstopper or a bug, just a nice-to-have:
I basically use the multithreading code proposed by Wavic:

import concurrent.futures
import pathlib

images = pathlib.Path('path_to_folder').glob('**/*.tif')  # recursive; returns a generator
with concurrent.futures.ThreadPoolExecutor() as executor:
    _ = executor.map(worker, images)  # worker = per-image processing function, defined elsewhere
Before that, I did it old school, reading the images sequentially in a for loop.
Obviously very much slower.
BUT: by inserting a counter inside the for loop, I could monitor its progress (500 done, 1000 done, 1500 done, etc.).
You can't simply insert a counter in the worker function, because printing from multiple threads is erratic,
and how would a multithreaded count even work? Via a global variable? One big mess.
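To be fair, a global counter can be guarded with a lock, though it does clutter the worker. A minimal sketch, where process() is a hypothetical stand-in for the per-image work:

import threading

count = 0
count_lock = threading.Lock()

def worker(image):
    global count
    process(image)        # hypothetical per-image work
    with count_lock:      # the lock keeps concurrent increments from racing
        count += 1
        if count % 500 == 0:
            print(count, 'done')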
Although I hinted at this problem in another post, nobody seems to know.
So my definitive question: how does one monitor the progress of a ThreadPool, given
that the images are scans that need to be processed?
If a user starts a batch of 3000, does he/she have time to go and get a coffee before the batch is finished?
If it is not possible, OK with me. Plan B.
Paul
OK, don't bother, I found that it is possible, but hardly KISS:
"... This can be achieved by issuing tasks asynchronously to the ThreadPool,
such as via the apply_async() function
and specifying a callback function via the “callback” argument...."
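A minimal sketch of that quoted approach, assuming the worker and images from the first post (note that apply_async() belongs to multiprocessing.pool.ThreadPool, not to concurrent.futures):

from multiprocessing.pool import ThreadPool

done = 0

def on_done(result):
    # callbacks run one at a time in the pool's result-handler thread,
    # so a plain global counter is safe here
    global done
    done += 1
    if done % 500 == 0:
        print(done, 'done')

with ThreadPool() as pool:
    for image in images:
        pool.apply_async(worker, (image,), callback=on_done)
    pool.close()   # no more tasks will be submitted
    pool.join()    # wait for the whole batch to finish

There is also an error_callback argument for workers that raise.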


Unless there is a KISS solution that I am not aware of...
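For what it's worth, concurrent.futures itself comes fairly close to KISS: submit the tasks and count them off as they finish with as_completed(). Again a sketch, assuming the worker and images from the first post:

import concurrent.futures

with concurrent.futures.ThreadPoolExecutor() as executor:
    futures = [executor.submit(worker, image) for image in images]
    for done, future in enumerate(concurrent.futures.as_completed(futures), 1):
        future.result()  # re-raises any exception from the worker
        if done % 500 == 0:
            print(done, 'done')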
Plan B is to run some tests for each type of document, calculate the average
processing time (pro rata) for a batch of 10, and predict the end time.
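A back-of-the-envelope sketch of that estimate, reusing the worker from above (the batch size of 3000 is just the example from the question):

import time

def estimate_minutes(sample_images, total):
    # time a small sample sequentially and extrapolate to the whole batch;
    # a threaded run should finish sooner than this sequential extrapolation
    start = time.perf_counter()
    for image in sample_images:
        worker(image)
    per_image = (time.perf_counter() - start) / len(sample_images)
    return per_image * total / 60

# usage, e.g.: estimate_minutes(list(images)[:10], 3000)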
Won't be a minute wrong. :)
Paul