multiprocess hang when certain number is used in the program

Python Forum (https://python-forum.io), General Coding Help (https://python-forum.io/forum-8.html), thread /thread-30587.html
multiprocess hang when certain number is used in the program - esphi - Oct-27-2020

Hi,

I am a Python beginner and I am starting to learn multiprocessing. I have created this program to simply calculate the squares of a big list of numbers.

The program takes a start_number and an end_number, separates the numbers into groups, then uses multiprocessing to calculate the results. Each process puts its results in a list, forms a dictionary with its sequence number as the key, and puts that dictionary into a Queue. At the end, the program combines the results into one list and prints it.

The program works fine with start_number = 1 and end_number = any number from 1 to 12_998. However, the program does not work when end_number = 12_999. Past 12_999 it works with some end_number values and not others. For example, it works with 13_000 through 13_005 but not 13_006 or 13_007. It works with 13_008 through 13_012 but not 13_013, and so on.

I have tried running the same program on different computers with different CPU counts, on both Windows and Linux. The results are the same. I am using Python 3.6.9.

I learned later that multiprocessing.Pool is easier to use in this scenario, but I am interested to know what is wrong with my multiprocessing.Process program, if anyone is kind enough to have a look. Thank you.

-----------------
My code:
-----------------

    # This program hangs when end number = 12_999, 13_006, 13_007, 13_013, etc.
    from multiprocessing import Process, Queue
    from multiprocessing import log_to_stderr, get_logger

    # Divide the range of numbers into groups, so that each group can be processed by a different Process.
    def dividegroup(minno, maxno, nof_groups):
        total_range = maxno - minno + 1
        remain = total_range % nof_groups
        group_range = total_range // nof_groups
        print(total_range, group_range, remain)
        lof_groups = []
        for i in range(nof_groups):
            lof_groups.append(range(minno + (i * group_range), minno + ((i + 1) * group_range)))
        if remain != 0:
            lof_groups.append(range(minno + ((i + 1) * group_range), maxno + 1))
        return lof_groups

    # Square the numbers and put the result in the Queue.
    def square(seq, numbers, q):
        answers = [x * x for x in numbers]
        results = {seq: answers}
        q.put(results, block=False)

    # Main program.
    def main():
        log_to_stderr()
        logger = get_logger()
        logger.setLevel(20)
        print('\033c')
        start_number = 1
        end_number = 12_999
        number_of_groups = 8
        list_of_groups = dividegroup(start_number, end_number, number_of_groups)
        print(list(list_of_groups))
        q = Queue(maxsize=0)
        process_seq = 0
        processes = []
        for i in list_of_groups:
            process_seq += 1
            process = Process(target=square, args=(process_seq, i, q))
            processes.append(process)
        for process_s in processes:
            process_s.start()
        for process_j in processes:
            process_j.join()
        result_dic = {}
        while not q.empty():
            result_dic.update(q.get())
        result_list = []
        keylist = list(result_dic.keys())
        keylist.sort()
        for i in keylist:
            result_list += result_dic.get(i)
        print(result_list)

    if __name__ == '__main__':
        main()

----------------------------------------------------
The result when end_number = 12_999 is used.
----------------------------------------------------
RE: multiprocess hang when certain number is used in the program - esphi - Oct-27-2020

I am new to the forum.

RE: multiprocess hang when certain number is used in the program - deanhystad - Oct-27-2020

I find it hangs for most ranges and seldom works.

RE: multiprocess hang when certain number is used in the program - esphi - Oct-27-2020

Hi Deanhystad,

1) I have an 8-core CPU. I was trying to see how the number of processes, in relation to the number of CPUs, influences the speed.

2) For other numbers, it prints a list of results. For example, with end_number = 20 the output is as follows. With end_number = 12_999, the program hangs: it does not print the result, and "[MainProcess] process shutting down" never appears. With end_number = 13_000, the program works, but the printout is very long, so I did not put an example here.

-------------------------------------
Example when end_number = 20
-------------------------------------
RE: multiprocess hang when certain number is used in the program - esphi - Oct-27-2020

(Oct-27-2020, 10:05 AM)deanhystad Wrote: I find it hangs for most ranges and seldom works.

It works from 1 to 12_998 and starts giving problems at 12_999.

RE: multiprocess hang when certain number is used in the program - deanhystad - Oct-27-2020

The problem appears to be with putting large results in the queue. I tried running with 1 process, and it works until the answers list in square() becomes large. I can do any number of square calculations as long as I only add a few numbers to the queue. This is probably why it seldom works for me: one of the first things I tried was reducing the number of processes, which increased the size of each individual result.

RE: multiprocess hang when certain number is used in the program - esphi - Oct-27-2020

(Oct-27-2020, 10:54 AM)deanhystad Wrote: The problem appears to be with putting large results in the queue. I tried running with 1 process and it works until square.answers becomes large. I can do any number of square calculations as long as I only add a few numbers to the queue. This is probably why it seldom works for me. One of the first things I tried was reduce the number of processes which increased the size of each individual results.

1) I found that varying the number of groups (processes) does alter the end_number that results in a hang. When I reduce number_of_groups to 2 (line 42), the program hangs with end_number = 11675.

2) I found that a similar problem does not develop if I change the operation from x * x to x + x (line 28 of the program).

That does indicate that the problem is not due to the number of processes.

By the way, I have altered the program so that it loops end_number from 1 up to a specified number: I turned main() into a function and wrote a new main that runs the old one in a loop, so that I can run through the numbers until it fails. That is how I found 11675. If it is helpful, I can post the new program here.
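One way to probe the x * x versus x + x observation is to compare how large each result becomes once it is serialized, since multiprocessing.Queue pickles every object before sending it through a pipe. The sketch below is only an illustration: payload_size is a hypothetical helper, it uses the default pickle protocol rather than whatever the queue used on Python 3.6.9, and the group of 1,624 numbers is what dividegroup() produces for end_number = 12_999 with 8 groups.

```python
import pickle


def payload_size(seq, values):
    """Return the pickled size in bytes of a {seq: values} result dict.

    Hypothetical helper for illustration; it mirrors the dict that
    square() puts on the queue, but is not part of the original program.
    """
    return len(pickle.dumps({seq: list(values)}))


# One group of 1,624 numbers (1..1624), as in the 8-group run to 12_999.
numbers = range(1, 1625)

squares = payload_size(1, (x * x for x in numbers))
doubles = payload_size(1, (x + x for x in numbers))

print(f"x * x payload: {squares} bytes")
print(f"x + x payload: {doubles} bytes")
```

The squared values need more bytes per integer than the doubled ones, so the same group produces a larger message. That is consistent with the idea that the hang depends on how much data is waiting in the queue rather than on the number of processes.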
RE: multiprocess hang when certain number is used in the program - esphi - Nov-06-2020

Thanks deanhystad. Does anyone else have any idea? I still have no clue what happened after scratching my head for a few days.

Thanks in advance.
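For reference, the symptoms in this thread match a pitfall described in the Python multiprocessing documentation under "Joining processes that use queues": a process that has put items on a queue may not terminate until its buffered data has been flushed to the underlying pipe, so calling join() before the parent has read the queue can wait forever once the results are large enough. The same documentation notes that Queue.empty() is not reliable. What follows is a minimal sketch of the usual workaround, not the original poster's program: run_squares and the chunking in the __main__ block are hypothetical names and choices made for illustration.

```python
from multiprocessing import Process, Queue


def square(seq, numbers, q):
    """Square each number and put {seq: squares} on the queue."""
    q.put({seq: [x * x for x in numbers]})


def run_squares(groups):
    """Start one process per group and read all results BEFORE joining."""
    q = Queue()
    processes = [
        Process(target=square, args=(seq, numbers, q))
        for seq, numbers in enumerate(groups, start=1)
    ]
    for p in processes:
        p.start()

    # Each process puts exactly one item, so do one blocking get() per
    # process instead of polling q.empty() after the joins.
    results = {}
    for _ in processes:
        results.update(q.get())

    # The queue has been drained, so the children are free to exit.
    for p in processes:
        p.join()

    # Reassemble the squares in sequence-number order.
    ordered = []
    for seq in sorted(results):
        ordered += results[seq]
    return ordered


if __name__ == "__main__":
    # Hypothetical split of 1..12_999 into chunks of 1,625 numbers;
    # this is not the original dividegroup() function.
    numbers = range(1, 13_000)
    size = 1625
    groups = [numbers[i:i + size] for i in range(0, len(numbers), size)]
    out = run_squares(groups)
    print(len(out), out[:3], out[-1])
```

The only structural change from the posted program is the order of operations: results are taken off the queue first, and join() is called afterwards. If a child could fail before putting its result, q.get(timeout=...) would be a safer choice than an unbounded blocking get().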