How much multi-threading should be done?
#1
Question 1: Without resorting to trial and error, is there a general rule for how deep multi-threading should go?

Let's say I have 5 nested for loops, all of which loop 100 times.

Do I multi-thread the first loop? First and second? First three? All but last? All?

Question 2: Assuming I am to multi-thread all the loops, would I do the following?
for a, aVal in enumerate(sa_data[0]):
    thread.start_new_thread(func2, 0) # pass 0 as arg if my func requires no args?

def func2(neverUsed):
    for b, bVal in enumerate(sa_data[1]):
        thread.start_new_thread(func3, 0)

def func3(neverUsed):
    for c, cVal in enumerate(sa_data[2]):
        ...
Is there possibly a way to do the same as the above, but without function calls?
#2
(Apr-17-2018, 12:00 PM)IAMK Wrote: Let's say I have 5 nested for loops, all of which loop 100 times.
None. 100 is a tiny number.

But basically, my rule of thumb is to avoid it at all costs unless it's absolutely 100% needed. And then, using a Process Pool is probably easier than spinning up dozens of processes/threads manually.
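For the record, a minimal process-pool sketch might look like this (do_work and the inputs are placeholders, not anything from this thread):

import multiprocessing

def do_work(item):
    return item * item  # stand-in for the real per-item computation

if __name__ == '__main__':
    with multiprocessing.Pool() as pool:  # defaults to one worker per CPU core
        results = pool.map(do_work, range(100))
        print(results[:5])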
#3
The general use of multiprocessing is doing two things at once: keeping a process running while still being able to press a key to kill it, or while updating some progress display.
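A rough sketch of the kill-it-with-a-key idea, using a threading.Event as the stop flag (the worker here is made up for illustration):

import threading

stop = threading.Event()

def worker():
    while not stop.is_set():
        pass  # do a chunk of work, update a progress display, etc.

t = threading.Thread(target=worker)
t.start()
input('Press Enter to stop... ')  # main thread stays free to take input
stop.set()
t.join()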
#4
(Apr-17-2018, 04:13 PM)nilamo Wrote: 100 is a tiny number.
100^5 is a tiny number?

(Apr-17-2018, 04:59 PM)woooee Wrote: have a running process and also be able to enter a key to kill it
Could you please give me an example of where such a thing would be needed? I was mainly thinking about multi-threading for data processing.
#5
Multiprocessing is more expensive because starting a process takes more instructions, and switching between processes costs more too. It's reasonable to use multiprocessing when real CPU computation is involved; for I/O operations, threads are a better choice. That's my understanding, anyway. Could be wrong. Blush
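In concurrent.futures terms, the split looks roughly like this (cpu_task, io_task, and the file names are all placeholders):

from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def cpu_task(n):  # CPU-bound work: separate processes sidestep the GIL
    return sum(i * i for i in range(n))

def io_task(path):  # I/O-bound work: threads are cheap and sufficient
    with open(path) as f:
        return len(f.read())

if __name__ == '__main__':
    with ProcessPoolExecutor() as ex:
        print(list(ex.map(cpu_task, [10**6] * 4)))
    for name in ('a.txt', 'b.txt'):  # create two small files so the demo runs
        with open(name, 'w') as f:
            f.write('hello\n')
    with ThreadPoolExecutor() as ex:
        print(list(ex.map(io_task, ['a.txt', 'b.txt'])))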
"As they say in Mexico 'dosvidaniya'. That makes two vidaniyas."
https://freedns.afraid.org
#6
(Apr-17-2018, 09:30 PM)wavic Wrote: For io operations, threads are a better choice.
I am doing many checks on 25 billion combinations and then writing the results to files.

Also, I cannot get the thread to work. Do you know how to make the arg work (my function doesn't really need args, so I'm passing the enumerator to make it unique)?

Loop right before threading:
for a, aVal in enumerate(sa_data[0]):

Passing int:
_thread.start_new_thread(reel2, a)
Error: 2nd arg must be a tuple

Passing the int cast to a tuple:
_thread.start_new_thread(reel2, tuple(a))
Error: 'int' object is not iterable

As per https://www.tutorialspoint.com/python/python_tuples.htm
A tuple is a sequence, so...

Passing the enumeration pair as a tuple:
_thread.start_new_thread(reel2, (a, aVal))
I noticed I could also do _thread.start_new_thread(reel2, (a,))?
Error: Fatal Python error: could not acquire lock for <_io.BufferedWriter name='<stderr>'> at interpreter shutdown, possibly due to daemon threads

I'm not sure how to acquire a lock right before the file write inside reel2() and then release it afterwards.
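From the docs, I gather the lock part would look roughly like this, though I haven't verified it (reel2 reduced to a stand-in for the real checks):

import threading

write_lock = threading.Lock()
out_file = open('results.txt', 'a')

def reel2(a):
    line = str(a) + '\n'  # stand-in for the real checks
    with write_lock:  # only one thread writes at a time
        out_file.write(line)  # lock is released automatically on exit

threads = [threading.Thread(target=reel2, args=(a,)) for a in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
out_file.close()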
#7
You probably shouldn't be using the _thread module. threading.Thread(target=your_function, args=(some, tuple)).start() is probably where you want to end up... if you want to use threads.

Note, though, that threads in Python aren't "real" threads in the parallel sense: because of the Global Interpreter Lock, only one thread executes Python bytecode at a time, so they won't utilize multiple cores on your processor and will probably not speed anything up in this case.

If this is the direction you want to head, then you should probably look at multiprocessing.Pool: https://docs.python.org/3/library/multip....pool.Pool
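A rough sketch of that direction, flattening the five loops with itertools.product (check and its filter condition are placeholders for your real per-combination test):

import itertools
from multiprocessing import Pool

def check(combo):
    a, b, c, d, e = combo
    return combo if (a + b + c + d + e) % 97 == 0 else None  # made-up test

if __name__ == '__main__':
    # 100**5 combinations as a lazy iterator; this will still take a long time
    combos = itertools.product(range(100), repeat=5)
    with Pool() as pool, open('hits.txt', 'w') as out:
        for hit in pool.imap_unordered(check, combos, chunksize=10000):
            if hit is not None:
                out.write(repr(hit) + '\n')

As a bonus, itertools.product also answers the earlier question about replacing the nested loops without function calls.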
#8
The main "issue" with the code is those nested for loops.
If those could be avoided...
The SciPy module should handle this more effectively, but I can't tell for sure. I've never needed it and I don't know it at all.
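For instance, NumPy (which SciPy builds on) can replace a pair of nested loops with one array operation. A toy sketch, not tied to the actual data here:

import numpy as np

a = np.arange(100)
pair_sums = a[:, None] + a[None, :]  # all 100*100 pairwise sums, no Python-level loop
print(pair_sums.shape)  # (100, 100)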
"As they say in Mexico 'dosvidaniya'. That makes two vidaniyas."
https://freedns.afraid.org

