Posts: 687
Threads: 37
Joined: Sep 2016
Jan-11-2017, 10:04 PM
So I have a list L. To do some parallel processing, I want to split it into N sublists of equivalent length. Which items end up in which sublist doesn't matter. What is your shortest and/or most pythonic code? Can you avoid using len(L) explicitly?
Unless noted otherwise, code in my posts should be understood as "coding suggestions", and its use may require more neurones than the two necessary for Ctrl-C/Ctrl-V.
Your one-stop place for all your GIMP needs: gimp-forum.net
Posts: 4,220
Threads: 97
Joined: Sep 2016
Jan-12-2017, 01:26 AM
(This post was last modified: Jan-12-2017, 01:26 AM by ichabod801.)
M = [L[start::N] for start in range(N)]
This assumes N divides evenly into len(L). If that's a concern, you could put a second index into the slice, or double zip it, to account for that.
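For illustration, here is a minimal sketch of the stride slicing above applied to a list whose length is not a multiple of N. The sample data and the helper name `split_strided` are assumptions made for the example. No item is lost in that case; the first len(L) % N sublists simply end up one item longer.

```python
def split_strided(items, n):
    """Split items into n sublists by taking every n-th element."""
    return [items[start::n] for start in range(n)]


L = list(range(10))  # sample data (assumption)
N = 4                # 4 does not divide 10
M = split_strided(L, N)

print(M)
# Sublist lengths differ by at most one.
print([len(part) for part in M])
```

Note that len(L) never appears explicitly, which was one of the wishes in the question.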
Posts: 591
Threads: 26
Joined: Sep 2016
Jan-12-2017, 02:41 AM
(This post was last modified: Jan-12-2017, 02:41 AM by Mekire.)
So you want to chunk?
Lots of good answers here:
http://stackoverflow.com/questions/31244...zed-chunks
The standard idiom people generally use is:
chunks = (L[i:i + N] for i in range(0, len(L), N))
This assumes the whole sequence fits in memory, which it sounds like you are trying to avoid.
Alternatives use iter and zip and look like this:
chunks_zip = zip(*[iter(L)]*N)
The second version above ignores groups that are not full, but it can handle generators that wouldn't otherwise fit in memory:
Python3
L = range(10**25)
N = 3
chunks_zip = zip(*[iter(L)]*N)
for i in range(10):
    print(next(chunks_zip))
Output:
(0, 1, 2)
(3, 4, 5)
(6, 7, 8)
(9, 10, 11)
(12, 13, 14)
(15, 16, 17)
(18, 19, 20)
(21, 22, 23)
(24, 25, 26)
(27, 28, 29)
Use zip_longest with a fill value if you don't want to lose incomplete groups.
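If neither dropping nor padding the last group is wanted, one alternative is to pull items with itertools.islice. This is a different technique from the zip idiom above, shown only as a sketch; the helper name `chunked` is an assumption made for the example.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield lists of up to `size` items; the last list may be shorter."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return  # iterator exhausted
        yield chunk


print(list(chunked(range(10), 3)))
```

Like the zip version, this consumes the input lazily, so it also works on iterables that do not fit in memory.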
Posts: 687
Threads: 37
Joined: Sep 2016
Jan-12-2017, 08:10 AM
(This post was last modified: Jan-12-2017, 08:10 AM by Ofnuts.)
(Jan-12-2017, 01:26 AM)ichabod801 Wrote: M = [L[start::N] for start in range(N)]
This assumes N divides evenly into len(L). If that's a concern, you could put a second index into the slice, or double zip it, to account for that.
This is more or less what my code does, but with slightly clumsier Python. And as far as I can tell this works even if N doesn't divide len(L).
(Jan-12-2017, 02:41 AM)Mekire Wrote: So you want to chunk?
Lots of good answers here:
http://stackoverflow.com/questions/31244...zed-chunks
The standard idiom people generally use is:
chunks = (L[i:i + N] for i in range(0, len(L), N))
This assumes the whole sequence fits in memory, which it sounds like you are trying to avoid.
No, in the real-life problem I am handling rather small lists. It's just that each item ends up used in a webservice call, so to speed things up I create N threads and give each a part of the original list.
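A rough sketch of that setup, assuming concurrent.futures is acceptable for the threading part. The function `call_webservice` is a hypothetical stand-in for the real call, and the sample data and N are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def call_webservice(item):
    """Hypothetical stand-in for the real web service call."""
    return item * 2


def process_part(part):
    """Handle one sublist inside a worker thread."""
    return [call_webservice(item) for item in part]


L = list(range(10))  # sample data (assumption)
N = 4                # number of worker threads (assumption)
parts = [L[start::N] for start in range(N)]

# One task per sublist; map() returns results in the order of `parts`.
with ThreadPoolExecutor(max_workers=N) as pool:
    results = list(pool.map(process_part, parts))

print(results)
```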
(Jan-12-2017, 02:41 AM)Mekire Wrote: Alternatives use iter and zip and look like this:
chunks_zip = zip(*[iter(L)]*N)
The second version above ignores groups that are not full, but it can handle generators that wouldn't otherwise fit in memory:
Python3
L = range(10**25)
N = 3
chunks_zip = zip(*[iter(L)]*N)
for i in range(10):
    print(next(chunks_zip))
Output:
(0, 1, 2)
(3, 4, 5)
(6, 7, 8)
(9, 10, 11)
(12, 13, 14)
(15, 16, 17)
(18, 19, 20)
(21, 22, 23)
(24, 25, 26)
(27, 28, 29)
Use zip_longest with a fill value if you don't want to lose incomplete groups.
Clever, but if you are unlucky, the last chunk produced by the idioms above can have a length of 1, so the maximum size difference is N-1. With the one-in-every-N sampling (L[start::N]) you get more uniform sizes (but possibly using more CPU/memory, which isn't really a concern for me but could be for someone else).
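To make the size argument concrete, here is a small comparison on made-up data (the list, the chunk size and N are assumptions chosen so that the leftover is a single item):

```python
L = list(range(10))  # sample data (assumption)

# Fixed-size chunks of 3: the last chunk holds the single leftover item.
chunk_size = 3
chunks = [L[i:i + chunk_size] for i in range(0, len(L), chunk_size)]
chunk_lengths = [len(c) for c in chunks]

# One-in-every-N sampling into N = 4 sublists.
N = 4
strided = [L[start::N] for start in range(N)]
strided_lengths = [len(s) for s in strided]

print(chunk_lengths)    # sizes differ by up to chunk_size - 1
print(strided_lengths)  # sizes differ by at most 1
```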
Posts: 2,953
Threads: 48
Joined: Sep 2016
There is an example in the Python documentation: https://docs.python.org/3.5/library/iter...ls-recipes
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)
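As a usage sketch, the padding can be stripped again afterwards by filling with a unique sentinel object, so that a legitimate None in the data is not mistaken for padding. The name `_MISSING` is an assumption made for the example; the recipe itself is repeated so the snippet runs on its own.

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


_MISSING = object()  # hypothetical sentinel, never equal to real data

# Drop the sentinel from each group so the last group is simply shorter.
groups = [
    [item for item in group if item is not _MISSING]
    for group in grouper('ABCDEFG', 3, _MISSING)
]
print(groups)
```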
Posts: 687
Threads: 37
Joined: Sep 2016
(Jan-12-2017, 09:13 AM)wavic Wrote: There is an example in the Python documentation: https://docs.python.org/3.5/library/iter...ls-recipes
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)
Yes, this is Mekire's suggestion...
Posts: 4,220
Threads: 97
Joined: Sep 2016
(Jan-12-2017, 08:10 AM)Ofnuts Wrote: This is more or less what my code does, but with slightly clumsier Python. And as far as I can tell this works even if N doesn't divide len(L).
You said you wanted lists of equivalent length. I took that to mean equal length. If N does not divide evenly into len(L), then some of the lists will be one item longer. The fixes I proposed would make them all the same length, but they would drop some items.
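A sketch of what those two fixes might look like, on made-up data. This is one reading of "a second index into the slice" and of "double zip"; the variable names are assumptions made for the example. Both variants give N parts of identical length and drop the leftover items.

```python
L = list(range(10))  # sample data (assumption)
N = 4

# Second slice index: stop at the largest multiple of N not exceeding len(L).
usable = len(L) - len(L) % N
equal_parts = [L[start:usable:N] for start in range(N)]

# Double zip: the inner zip drops the incomplete group, the outer zip
# transposes the full groups into N tuples.
double_zipped = list(zip(*zip(*[iter(L)] * N)))

print(equal_parts)
print(double_zipped)
```

With this data, items 8 and 9 are the ones dropped.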
Posts: 2,953
Threads: 48
Joined: Sep 2016
Jan-12-2017, 11:01 PM
(This post was last modified: Jan-12-2017, 11:08 PM by snippsat.)
How about this
It looks simple enough for everyone.