Dec-22-2017, 07:00 AM
I am trying to run a piece of code that converts csv files to xlsx, adds headers and auto-fits columns. Each "block" checks the file name and then does the above for that file. There are 23 files with sizes from 14 kB to 437 MB. This process is sequential and takes 1449 seconds.
To speed things up, I wanted to use multithreading to process all files at the same time and, when they are done, proceed with the rest of the code. So far I have tried three approaches, but none of them gave the results I was aiming for.
1.) The first approach uses Process. It doesn't wait for all processes to finish and just runs on; it basically runs over itself and doesn't work at all for what I intended.
2.) The second approach uses Thread and gives an error.
3.) The last approach simply appends each worker to a list and then uses a loop to start and join them, using either Process or Thread.
How can I get my 23 functions to run simultaneously and then wait for all of them to finish (in essence, wait for the longest one to complete)?
1631 seconds = sequentially with or without functions
6903 seconds = with Thread instead of Process
Approach 1 (Process):

    if __name__ == '__main__':
        p01 = Process(target=PR01)
        ...
        p01.start()
        ...
        p21.join()
        ...
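For comparison, here is a minimal sketch of the "start everything, then join everything" pattern the post is aiming for. It is not the poster's code: `run_all` and `demo_task` are hypothetical names, and `demo_task` only sleeps as a stand-in for one of the PR functions. The assumption is that each PR function takes no arguments, as in the snippets shown.

```python
import time
from multiprocessing import Process


def run_all(targets, worker_cls=Process):
    """Start one worker per callable, then wait until all of them have finished."""
    workers = [worker_cls(target=target) for target in targets]
    for worker in workers:   # launch every worker first ...
        worker.start()
    for worker in workers:   # ... and only then block on each one in turn
        worker.join()
    return workers


def demo_task():
    """Hypothetical stand-in for one of the PR functions."""
    time.sleep(0.2)


if __name__ == '__main__':
    started = time.perf_counter()
    run_all([demo_task] * 4)
    # Code placed here runs only after every worker has been joined.
    print(f'all workers done after {time.perf_counter() - started:.2f} s')
```

Keeping the joins in a separate loop after all the starts is what lets the workers overlap; calling `start()` and `join()` back to back for each worker would serialize them again. On Windows the `if __name__ == '__main__':` guard matters, because each child process imports the main module again.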
Approach 2 (Thread) first fails with:
Error: CoInitialize has not been called., None, None)
So I added pythoncom.CoInitialize() to the PR functions (and also tried it in combination with pythoncom.CoUninitialize()). Then it runs, but it doesn't seem to run simultaneously, although it does seem to wait for the threads to finish. This method, however, is about 3 times slower than just running the code sequentially without functions:
    p01 = Thread(name='PR01', target=PR01)
    ...
    p01.start()
    ...
    p01.join()
    ...
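The CoInitialize error comes up because COM has to be initialized in each thread that uses it. One way to keep that setup and teardown out of the individual PR functions might be a small decorator like the sketch below. `with_setup` is a hypothetical helper, not part of pythoncom, and the pywin32 usage is shown only as a comment since it needs Windows.

```python
import functools


def with_setup(init, cleanup):
    """Wrap a function so init() runs before it and cleanup() after it,
    inside whichever thread ends up calling the wrapped function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            init()
            try:
                return func(*args, **kwargs)
            finally:
                cleanup()  # runs even if func raises
        return wrapper
    return decorator


# Intended use on Windows with pywin32 (an assumption, not executed here):
#
#     import pythoncom
#
#     @with_setup(pythoncom.CoInitialize, pythoncom.CoUninitialize)
#     def PR01():
#         ...
```

This only addresses the error itself. It does not by itself make the threads run faster than the sequential version.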
Approach 3 (list of workers):

    Threads = []
    Threads.append(Thread(name='PR01', target=PR01))
    ...
    for x in Threads:
        x.start()
    for x in Threads:
        x.join()
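Since every block does the same work on a different file, the list-and-loop idea could also be written with one function and a pool from `concurrent.futures`. The following is only a sketch under that assumption: `convert_all` and `fake_convert` are hypothetical names, and `fake_convert` merely renames the path instead of doing the real csv-to-xlsx conversion.

```python
from concurrent.futures import ThreadPoolExecutor


def convert_all(paths, convert, max_workers=4, executor_cls=ThreadPoolExecutor):
    """Run convert(path) for every path and return the results in input order."""
    with executor_cls(max_workers=max_workers) as pool:
        # list() collects every result; leaving the with-block waits for the workers.
        return list(pool.map(convert, paths))


def fake_convert(path):
    """Hypothetical placeholder for the real csv-to-xlsx conversion."""
    return path.rsplit('.', 1)[0] + '.xlsx'


if __name__ == '__main__':
    print(convert_all(['a.csv', 'b.csv'], fake_convert))
```

Passing `executor_cls=ProcessPoolExecutor` (also from `concurrent.futures`) would swap the threads for processes without changing the rest. Given the timings reported in the post, that may be the more promising variant, but it is untested against the actual workload.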