How to use multiprocessing with an array that is continually being appended
#1
I have a (long!) script that I want to speed up considerably by splitting it into three modules and running them in parallel rather than sequentially, as they run now. Each module will need to read from several arrays that are constantly being appended to, and I'm a little confused about which multiprocessing method is best for that, since each module will potentially be reading the information at different times (e.g., process A could be writing value 100 to the array while process B is reading value 50 and process C value 30).

Would a pipe have to send every value in the array at once, or can I keep appending values as they arrive? If it helps, the information only needs to travel one way.
#2
I think a Queue is what you're looking for. One process can add things to it and another can take them out. If each of your tasks/processes has an "in" and an "out" queue, you can chain them together into a pipeline that works on the data in stages.

https://docs.python.org/3/library/multip...sing.Queue
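Roughly like this (a minimal sketch; the worker function and the None sentinel are just one common way to signal that no more data is coming):

from multiprocessing import Process, Queue

def worker(in_queue):
    while True:
        item = in_queue.get()
        if item is None:  # sentinel: the producer is done sending
            break
        print("got:", item)  # stand-in for the real work

if __name__ == "__main__":
    q = Queue()
    p = Process(target=worker, args=(q,))
    p.start()
    for value in range(5):
        q.put(value)  # the producer keeps appending as values arrive
    q.put(None)       # tell the worker there's nothing more coming
    p.join()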
#3
Perfect, thank you. Do you know if you can have more than one queue per process? (I.e., in my case, RGB values plus x, y coordinates.)

Edit: Or, alternatively, can I use a queue to send a numpy array?
#4
Queues can hold any picklable Python object (which is pretty much anything, except things like an open file handle or a socket). And there's no limit to the number of arguments, queues included, that you can pass to a process.
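For example (just a sketch run in one process for brevity; normally the get() calls would happen in the consumer process):

from multiprocessing import Queue
import numpy as np

q = Queue()

# one put() can carry several values at once, e.g. as a tuple
rgb = (255, 128, 0)
q.put((rgb, 10, 20))  # rgb plus x, y coordinates

# numpy arrays are picklable, so they go through a queue just fine
q.put(np.zeros((4, 4)))

rgb, x, y = q.get()   # unpack on the receiving side
arr = q.get()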
#5
You can also use a Manager list or dictionary: https://pymotw.com/3/multiprocessing/com...ared-state
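Something like this (a minimal sketch; the worker just appends one item so you can see the shared state):

from multiprocessing import Manager, Process

def worker(shared):
    shared.append("from the worker")  # visible to every process holding the proxy

if __name__ == "__main__":
    with Manager() as manager:
        shared = manager.list()  # a list proxy living in the manager process
        p = Process(target=worker, args=(shared,))
        p.start()
        p.join()
        print(list(shared))      # ['from the worker']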
#6
So is put essentially the same as appending? (ish)
#7
Yes. Instead of appending, you put() things into the queue, and then get() to remove an item from it. get() will block the caller until there's actually something to take out of the queue.

The JoinableQueue is also pretty useful, as it allows you to just call .join() on the queue when you're done adding things to it, and it'll block until the queue's been fully processed. https://docs.python.org/3/library/multip...nableQueue

Something like:
from multiprocessing import JoinableQueue as JQueue, Process

# imagine this runs in the worker process
def processor(in_queue):
    while True:
        item = in_queue.get()
        print("processing:", item)  # or whatever you do with it
        # now that we're done, let the queue know we've finished with this item
        in_queue.task_done()

if __name__ == "__main__":
    queue = JQueue()
    # create the worker process here, passing (queue,) as args so it has access
    # to the queue (daemon=True so the infinite loop doesn't block program exit)
    worker = Process(target=processor, args=(queue,), daemon=True)
    worker.start()

    # now we can add things to the queue, and the worker will process them
    queue.put("test thing")
    queue.put("something else")

    # then we just wait for the worker to finish processing everything we sent it
    queue.join()

    # at this point, we're back in the main process, the queue is empty,
    # and everything that was in it has been fully processed
#8
Excellent, thank you for your help.