Python Forum
Multiprocess not writing to file
#1
I am trying to improve the speed of some of my code with multiprocessing. Here is my code:
import glob
from multiprocessing import Process
from PIL import Image

def create_averaging_process(processes, image_file):
    p = Process(target=get_averages, args=(image_file,))
    processes.append(p)
    p.start()
    p.join()

def make_threads():
    processes = []
    index = 0
    allfiles = glob.glob(path+"/images/*")

    while (len(processes)<len(allfiles)):
        if (len(processes) - len([p for p in processes if not p.is_alive()]) < MAX_PROCESSES):
            create_averaging_process(processes, allfiles[index])
            index+=1
    print("done")
    
def get_averages(image):
    with open(".img_averages.txt", 'a') as file:
        try:
            img = Image.open(image).convert('RGB')
        except Exception as e:
            print(e)
        print(img)
        img2 = img.resize((1, 1), Image.ANTIALIAS) #easiest way of getting average color - resize to 1x1 with anti-alias                                     
        col = img2.getpixel((0, 0)) #get color of that single pixel

        try:
            something = col[2] #sometimes there are just singular ints
            print(col)
            file.write(str(col)) #writes color and size to file. Uses '#' to separate them
        except Exception as e:
            print(e)

make_threads()
Everything runs fine - it even prints out the colour of the pixels. However, nothing is written to the file. No matter how many times I try, the file stays at 0 bytes.
I am running this in a terminal, not IDLE.

What is the problem?
#2
Firstly, files are not threadsafe, so if several processes write to the file at the same time, data from one process can be lost. There may also be a problem on the multiprocessing side, because data is not shared across processes without specific threadsafe objects; refer to Geeks For Geeks - Multiprocessing in Python Set 2.

Secondly, create_averaging_process is denying you the benefit of multiprocessing. Process.join() halts execution until the process completes, so the function will not return until each child finishes; in effect, your processes run one after another.
#3
(Dec-05-2019, 11:18 PM)stullis Wrote: Process.join() halts execution until the process is completed so the function should not terminate until it's complete.
I should be able to delete that. I only added it in because some of the processes would complete after the 'done' message was printed.

(Dec-05-2019, 11:18 PM)stullis Wrote: Firstly, files are not threadsafe so if any of the processes are writing to the file, data would possibly be lost from another process also writing to the file at the same time. There may also be a problem with the multiprocessing module because data is not shared across processors without specific threadsafe objects; refer to Geeks For Geeks - Multiprocessing in Python Set 2.
According to this website:
Quote:Indeed, only one data structure is guaranteed to be thread safe: the Queue class in the multiprocessing module.
I did some googling on it last night and most of the answers used the multiprocessing queue class.
Should I use this?
#4
I've got it working with multiprocessing's queue class:
import glob
import multiprocessing
from multiprocessing import Process
from PIL import Image

def create_averaging_process(processes, image_file, q):
    p = Process(target=get_averages, args=(image_file, q))
    processes.append(p)
    p.start()

def write_to_file(file, q, end):
    with open(file, 'a') as f:
        while True: 
            line = q.get()
            if line == end:
                return
            f.write(str(line))
            

def get_averages(image, q):
    try:
        img = Image.open(image).convert('RGB')
    except Exception as e:
        print(e)
    img2 = img.resize((1, 1), Image.ANTIALIAS) #easiest way of getting average color - resize to 1x1 with anti-alias                                     
    col = img2.getpixel((0, 0)) #get color of that single pixel
    try:
        something = col[2] #sometimes there are just singular ints
        q.put(col)
    except Exception as e:
        print(e)


if __name__ == "__main__":
    index = 0
    processes = []
    queue = multiprocessing.Queue()

    STOP_TOKEN="end"
    
    allfiles = glob.glob(path+"/images/*")

    while (len(processes)<len(allfiles)):
        if (len(processes) - len([p for p in processes if not p.is_alive()]) < MAX_PROCESSES):
            create_averaging_process(processes, allfiles[index], queue)
            index+=1
   
    writer_process = multiprocessing.Process(target = write_to_file, args=(path+"/test.txt", queue, STOP_TOKEN))
    writer_process.start()
    
    # wait for every worker to finish before sending the stop token,
    # otherwise it can land in the queue ahead of some of their results
    for p in processes:
        p.join()
    queue.put(STOP_TOKEN)
    writer_process.join()
#5
I'm now having trouble using multiprocessing on another part.

This part of my code creates multiple images. I was going to make it more efficient by creating several at once, so the job finishes quicker. I thought it would be as simple as integrating the code I used before, but it doesn't work.

This time, rather than writing to a file, I am creating multiple images, all of which are large numpy arrays. I pass these arrays into the queue, so it now holds the image data. The next step would be to read the queue and save all of the images; however, the process(es) seem to hang when I pass these arrays.
I printed some debug lines and this is the output:
Output:
create_making_process
processes [<Process(Process-2, started)>]
while loop
create_making_process
processes [<Process(Process-2, started)>, <Process(Process-3, started)>]
while loop
out of loop
This is the correct output, however, after the last line 'out of loop', I would expect to see a numpy array being printed (because I call queue.get()) but nothing happens and it just hangs.
It seems when I call KeyboardInterrupt, sometimes this is the line it is hanging on:
lock.acquire(block, timeout) but sometimes it seems to be n = write(self.??, buf) (I can't remember the full line, and I couldn't get it to appear).

This is the code that creates the processes (it is literally what I had last time):
import multiprocessing
from PIL import Image

n_frames = Image.open(options['to_create']).n_frames #gets amount of frames in gif

queue = multiprocessing.Queue()
while (len(processes) < n_frames): #checks for current processes alive being less than all the files that need to be worked on
    if (len(processes) - len([p for p in processes if not p.is_alive()]) < 3): #if the current amount of live processes is less than 3 - limiting at 3 no matter what because it uses a lot of ram
        
        def create_making_process(processes, frame, q): #create some more processes
            print("create_making_process")
            p = multiprocessing.Process(target = MakeImage.create_image_gif, args=(frame, q)) #create a new process
            processes.append(p) #add it to array
            p.start() #start it
            
        create_making_process(processes, index, queue) 
        index+=1
        print("processes", processes)
    print("while loop")
print("out of loop")
In create_image_gif, I work on the image and then add it to queue as a numpy array like this:
q.put([new_img])
What is causing this problem?


