Python Forum
multiprocess hang when certain number is used in the program
#1
Hi, I am a Python beginner and I am starting to learn multiprocessing. I have created this program to calculate the squares of a big list of numbers.

The program takes a start_number and an end_number, separates the numbers into groups, then uses multiprocessing to calculate the results.

Each process puts its results in a list, wraps that list in a dictionary with the process's sequence number as the key, and puts the dictionary into a Queue.

At the end, the program combines the results into one list and prints it.



The program works fine with start_number = 1 and end_number = any number from 1 to 12_998.

However, the program hangs when end_number = 12_999. Past 12_999 it works with some end_number values and not others. For example, it works with 13_000 through 13_005 but not 13_006 or 13_007. It works with 13_008 through 13_012 but not 13_013, and so on.

I have tried running the same program on different computers with different CPU counts, on both Windows and Linux. The results are the same. I am using Python 3.6.9.

I have since learned that multiprocessing.Pool is easier to use in this scenario. But I would like to know what is wrong with my multiprocessing.Process program, if anyone is kind enough to have a look.
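
For comparison, here is a minimal sketch of the multiprocessing.Pool approach mentioned above. The names square_chunk, split_range and square_all are hypothetical and do not come from the original program. As a design choice of this sketch, split_range spreads any remainder over the first chunks instead of adding an extra group.

```python
from multiprocessing import Pool


def square_chunk(numbers):
    """Square every number in one chunk; this runs in a worker process."""
    return [x * x for x in numbers]


def split_range(start, end, chunks):
    """Split start..end (inclusive) into at most `chunks` contiguous ranges."""
    total = end - start + 1
    size, extra = divmod(total, chunks)
    ranges = []
    low = start
    for i in range(chunks):
        # The first `extra` chunks take one additional number each.
        high = low + size + (1 if i < extra else 0)
        if high > low:
            ranges.append(range(low, high))
        low = high
    return ranges


def square_all(start, end, workers=8):
    """Square start..end using a pool of worker processes."""
    with Pool(processes=workers) as pool:
        # map() returns the per-chunk results in the order of the inputs,
        # so no sequence numbers are needed to reassemble them.
        per_chunk = pool.map(square_chunk, split_range(start, end, workers))
    return [value for chunk in per_chunk for value in chunk]


if __name__ == "__main__":
    squares = square_all(1, 12_999)
    print(len(squares), squares[:5])
```

Pool.map collects the results in the parent for you, so the explicit Queue and the start/join loops are not needed in this version.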

Thank you.

-----------------
My code:
-----------------
# This program hangs when end_number = 12_999, 13_006, 13_007, 13_013, etc.
from multiprocessing import Process, Queue
from multiprocessing import log_to_stderr, get_logger


# divide the range of numbers into groups. so that each group can be processed with different Process. 
def dividegroup(minno, maxno, nof_groups):
    
    total_range = maxno - minno + 1
    remain = total_range % nof_groups
    group_range = total_range // nof_groups

    print(total_range, group_range, remain)

    lof_groups = []
    
    for i in range(nof_groups):
        lof_groups.append( range( minno + (i * group_range), minno + ((i + 1) * group_range )))

    if remain != 0 : lof_groups.append(range(minno + ((i + 1) * group_range), maxno + 1))

    return lof_groups


# square the numbers and put the results in the Queue
def square(seq, numbers, q):

    answers = [x * x for x in numbers]    
    results = {seq: answers} 
    q.put(results, block = False)

# main program
def main():

    log_to_stderr()
    logger = get_logger()
    logger.setLevel(20)

    print('\033c')
    start_number = 1
    end_number = 12_999
    number_of_groups = 8

    list_of_groups = dividegroup(start_number, end_number, number_of_groups)
    print(list(list_of_groups))

    q = Queue(maxsize=0)
    process_seq = 0
    processes = []
 
    for i in list_of_groups:
        process_seq += 1
        process = Process(target = square, args = (process_seq, i, q))
        processes.append(process)

    for process_s in processes:
        process_s.start()

    for process_j in processes:
        process_j.join()

    result_dic = {}
    while not q.empty():
        result_dic.update(q.get())       

    result_list = []
    keylist = list(result_dic.keys())
    keylist.sort()
    for i in keylist:
        result_list += result_dic.get(i)
    print(result_list)


if __name__ == '__main__':
    main()
----------------------------------------------------
The Result when end_number = 12_999 is used.
----------------------------------------------------
Output:
12999 1624 7
[range(1, 1625), range(1625, 3249), range(3249, 4873), range(4873, 6497), range(6497, 8121), range(8121, 9745), range(9745, 11369), range(11369, 12993), range(12993, 13000)]
[INFO/Process-3] child process calling self.run()
[INFO/Process-1] child process calling self.run()
[INFO/Process-3] process shutting down
[INFO/Process-2] child process calling self.run()
[INFO/Process-4] child process calling self.run()
[INFO/Process-3] process exiting with exitcode 0
[INFO/Process-7] child process calling self.run()
[INFO/Process-1] process shutting down
[INFO/Process-1] process exiting with exitcode 0
[INFO/Process-8] child process calling self.run()
[INFO/Process-9] child process calling self.run()
[INFO/Process-6] child process calling self.run()
[INFO/Process-4] process shutting down
[INFO/Process-8] process shutting down
[INFO/Process-4] process exiting with exitcode 0
[INFO/Process-5] child process calling self.run()
[INFO/Process-9] process shutting down
[INFO/Process-7] process shutting down
[INFO/Process-2] process shutting down
[INFO/Process-5] process shutting down
[INFO/Process-6] process shutting down
[INFO/Process-9] process exiting with exitcode 0
[INFO/Process-2] process exiting with exitcode 0
[INFO/Process-8] process exiting with exitcode 0
[INFO/Process-7] process exiting with exitcode 0
[INFO/Process-6] process exiting with exitcode 0
#2
I am new to the forum.
#3
I find it hangs for most ranges and seldom works.
#4
Hi Deanhystad,

1) I have an 8-core CPU. I was trying to see how the number of processes, relative to the number of CPU cores, influences the speed.
2) For other numbers, it prints a list of results. For example, the output for end_number = 20 is shown below.
With end_number = 12_999, the program hangs: it does not print the result, and "[INFO/MainProcess] process shutting down" is never logged.
With end_number = 13_000, the program works, but the printout is very long, so I did not include an example here.

-------------------------------------
Example when end_number = 20
-------------------------------------
Output:
20 2 4
[range(1, 3), range(3, 5), range(5, 7), range(7, 9), range(9, 11), range(11, 13), range(13, 15), range(15, 17), range(17, 21)]
[INFO/Process-1] child process calling self.run()
[INFO/Process-1] process shutting down
[INFO/Process-2] child process calling self.run()
[INFO/Process-1] process exiting with exitcode 0
[INFO/Process-3] child process calling self.run()
[INFO/Process-3] process shutting down
[INFO/Process-3] process exiting with exitcode 0
[INFO/Process-2] process shutting down
[INFO/Process-2] process exiting with exitcode 0
[INFO/Process-6] child process calling self.run()
[INFO/Process-8] child process calling self.run()
[INFO/Process-8] process shutting down
[INFO/Process-5] child process calling self.run()
[INFO/Process-7] child process calling self.run()
[INFO/Process-8] process exiting with exitcode 0
[INFO/Process-5] process shutting down
[INFO/Process-5] process exiting with exitcode 0
[INFO/Process-4] child process calling self.run()
[INFO/Process-7] process shutting down
[INFO/Process-4] process shutting down
[INFO/Process-6] process shutting down
[INFO/Process-7] process exiting with exitcode 0
[INFO/Process-6] process exiting with exitcode 0
[INFO/Process-4] process exiting with exitcode 0
[INFO/Process-9] child process calling self.run()
[INFO/Process-9] process shutting down
[INFO/Process-9] process exiting with exitcode 0
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256, 289, 324, 361, 400]
[INFO/MainProcess] process shutting down
#5
(Oct-27-2020, 10:05 AM)deanhystad Wrote: I find it hangs for most ranges and seldom works.

It works from 1 to 12_998 and starts giving problems at 12_999.
#6
The problem appears to be with putting large results in the queue. I tried running with 1 process and it works until the answers list in square() becomes large. I can do any number of square calculations as long as I only add a few numbers to the queue. This is probably why it seldom works for me: one of the first things I tried was to reduce the number of processes, which increased the size of each individual result.
#7
(Oct-27-2020, 10:54 AM)deanhystad Wrote: The problem appears to be with putting large results in the queue. I tried running with 1 process and it works until the answers list in square() becomes large. I can do any number of square calculations as long as I only add a few numbers to the queue. This is probably why it seldom works for me: one of the first things I tried was to reduce the number of processes, which increased the size of each individual result.

1) I found that varying the number of groups (processes) does change the end_number that results in a hang.
When I reduce number_of_groups to 2 (line 42), the program hangs with end_number = 11675.

2) I found that the problem does not occur if I change the operation from x * x to x + x (line 28 of the program).

That indicates the problem is not due to the number of processes.
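
One way to see why x * x and x + x could behave differently: a multiprocessing Queue pickles each item before writing it to a pipe, and larger integers take more bytes when pickled. The sketch below only compares pickled sizes for the eighth group from the 12_999 output; it assumes the default pickle protocol and does not reproduce the hang itself.

```python
import pickle

# Eighth group printed in the output for end_number = 12_999.
group = range(11369, 12993)

# Same shape as the dictionaries the square() function puts on the queue.
squares = {8: [x * x for x in group]}
doubles = {8: [x + x for x in group]}

square_bytes = len(pickle.dumps(squares))
double_bytes = len(pickle.dumps(doubles))

print("x * x payload:", square_bytes, "bytes")
print("x + x payload:", double_bytes, "bytes")
```

The squared values need more bytes per number than the doubled values, so the same count of numbers produces a larger payload, which fits the observation that the failing end_number shifts when the operation changes.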

btw: I have altered the program so that it loops end_number from 1 up to a specific number.
That is, I changed main() into a function and wrote a new main that runs the old main in a loop,
so that I can run through the numbers until it fails. That is how I found 11675.

If it is helpful, I can insert the new program here.
#8
Thanks, deanhystad.
Does anyone else have any idea? I still have no clue what happened after scratching my head for a few days.
Thanks in advance.
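
One likely explanation, consistent with deanhystad's observation about large results: the multiprocessing documentation (programming guidelines, "Joining processes that use queues") warns that a process which has put items on a Queue waits before terminating until all buffered items have been fed to the underlying pipe, so joining such a process before the queue has been drained can deadlock. The original main() joins first and reads the queue afterwards. Below is a minimal sketch of the reordered version. The names combine and run are hypothetical, and the payload size at which the hang starts depends on the platform's pipe buffer, which this sketch does not try to measure.

```python
from multiprocessing import Process, Queue


def square(seq, numbers, q):
    """Worker: square one group and send (seq, answers) to the parent."""
    q.put((seq, [x * x for x in numbers]))


def combine(results):
    """Flatten {seq: answers} into one list, ordered by sequence number."""
    return [value for seq in sorted(results) for value in results[seq]]


def run(groups):
    """Square every group in its own process and return one combined list."""
    q = Queue()
    processes = [Process(target=square, args=(seq, group, q))
                 for seq, group in enumerate(groups, start=1)]
    for p in processes:
        p.start()

    # Drain the queue BEFORE joining. Each worker sends exactly one item,
    # and q.get() blocks until an item arrives, so q.empty() is not needed.
    results = dict(q.get() for _ in processes)

    # All items have been received, so the workers are free to exit.
    for p in processes:
        p.join()

    return combine(results)


if __name__ == "__main__":
    # Two large groups, similar to the number_of_groups = 2 experiment.
    squares = run([range(1, 6501), range(6501, 13000)])
    print(len(squares), squares[:5])
```

Counting one get() per process also avoids relying on q.empty(), which the documentation describes as unreliable for multiprocessing queues.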