Python Forum
processes shall be parallel
#1
Hi,

how can I get this multiprocessing example working in the easiest way?

I've got 4 processes; each process lasts 2 seconds.

How can I get the four processes to finish in less than 8 seconds?
(They should run in parallel.)

Thanks a lot for your help...

from multiprocessing import Process
import time


def cpu_extensive():
    time.sleep(2)
    print('Done')


def main():

    # define processes
    p1 = Process(target=cpu_extensive())
    p1.start()
    p2 = Process(target=cpu_extensive())
    p2.start()
    p3 = Process(target=cpu_extensive())
    p3.start()
    p4 = Process(target=cpu_extensive())
    p4.start()

    p1.join()
    p2.join()
    p3.join()
    p4.join()


if __name__ == '__main__':
    start_measuring = time.time()
    main()
    end_measuring = time.time()

    t = end_measuring - start_measuring
    print(t)
#2
This executes the function:
cpu_extensive()
So you are calling cpu_extensive() first, then creating the process with target set to its return value, None.
Instead use:
p1 = Process(target=cpu_extensive)
For something like this you might want to look at Pool.
from multiprocessing import Pool
import time
 
def cpu_extensive(i):
    time.sleep(2)
    print(i, 'Done')

if __name__ == "__main__":
    starttime = time.time()
    pool = Pool()
    pool.map(cpu_extensive, range(4))
    pool.close()
    endtime = time.time()
    print(f"Time taken {endtime-starttime} seconds")
#3
Hi deanhystad,

thanks a lot for your great answer!!

Example 1:
from multiprocessing import Process
import time


def cpu_extensive():
    time.sleep(2)
    print('Done')


def main():

    # start the processes and keep a reference to each one in a list
    processes = []
    for i in range(5):
        p = Process(target=cpu_extensive)
        p.start()
        processes.append(p)

    # wait for every process, not just the last one started
    for p in processes:
        p.join()


if __name__ == '__main__':
    start_measuring = time.time()
    main()
    end_measuring = time.time()

    t = end_measuring - start_measuring
    print(t)
Example 2:
from multiprocessing import Pool
import time


def cpu_extensive(i):
    time.sleep(2)
    print(i, 'Done')


if __name__ == "__main__":
    starttime = time.time()
    pool = Pool()
    pool.map(cpu_extensive, range(4))
    pool.close()
    endtime = time.time()
    print(f"Time taken {endtime - starttime} seconds")
I simulate a load that takes 2 seconds, 4 times... and it is processed in parallel.

Example 1 and Example 2 both last approximately 2.18 seconds.

I'm planning to process 40,000 pictures with pHash and want to use multiprocessing to speed up the processing.

Perhaps you have an idea whether Example 1 or Example 2 is more suitable for this load...
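As a rough sketch of what I have in mind (assuming the Pillow and imagehash libraries, and a made-up "pictures" folder, not the final code):

from multiprocessing import Pool
from pathlib import Path

import imagehash        # assumption: the third-party imagehash package is installed
from PIL import Image   # assumption: Pillow is installed


def phash_file(path):
    # compute the perceptual hash of a single image
    with Image.open(path) as img:
        return path.name, imagehash.phash(img)


if __name__ == "__main__":
    # "pictures" is just a placeholder for the folder with the 40,000 images
    files = list(Path("pictures").glob("*.jpg"))
    with Pool() as pool:
        for name, h in pool.map(phash_file, files):
            print(name, h)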

Thank you very much for your help!!

flash77
#4
At most, multiprocessing is only going to increase your speed by about 4x, probably less. You might want to look at concurrent tasks (threads or asyncio) instead, depending on whether the work is I/O-bound or processor-bound.
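If the work turns out to be mostly I/O-bound, a thread pool is usually the simpler tool. A minimal sketch, with the 2-second sleep standing in for I/O:

from concurrent.futures import ThreadPoolExecutor
import time

def io_bound(i):
    # stand-in for an I/O-bound task, such as reading a file
    time.sleep(2)
    return i

if __name__ == "__main__":
    starttime = time.time()
    # threads are cheap to start and work well when the task mostly waits on I/O
    with ThreadPoolExecutor(max_workers=4) as executor:
        results = list(executor.map(io_bound, range(4)))
    print(results)
    print(f"Time taken {time.time() - starttime} seconds")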
#5
The slowest part is I/O. To get a benefit from multiprocessing, the I/O must be fast enough.
I don't see a significant difference (15 s vs. 12 s) between 1 process and 4 processes.
The data still has to be read from the SSD.

I'm not confident that mmap is the fastest possible way to hash a file.

from hashlib import md5
from pathlib import Path
from multiprocessing import Pool
from mmap import ACCESS_READ, mmap

EMPTY = md5().hexdigest()

def hasher(file: Path) -> str:
    # empty files cannot be mmapped, so handle them separately
    if file.stat().st_size == 0:
        return EMPTY

    with file.open("rb") as fd:
        with mmap(fd.fileno(), 0, access=ACCESS_READ) as mm:
            digest = md5(mm).hexdigest()
            print(digest, file, sep="  ")
            return digest


def main(glob):
    files = [element for element in Path().rglob(glob) if element.is_file()]
    with Pool(4) as pool:
        pool.map(hasher, files)


if __name__ == "__main__":
    main("*.pdf")
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!