Python Forum
Code golfing: splitting a list - Printable Version




Code golfing: splitting a list - Ofnuts - Jan-11-2017

So I have a list L. To do some parallel processing, I want to split it into N sublists of equivalent length. It doesn't matter which items end up in which sublist. What is your shortest and/or most pythonic code? Can you avoid using len(L) explicitly?



RE: Code golfing: splitting a list - ichabod801 - Jan-12-2017


Assumes N divides evenly into len(L). If that's a concern, you could put a second index into the slice, or double zip it, to account for that.


RE: Code golfing: splitting a list - Mekire - Jan-12-2017

So you want to chunk?
Lots of good answers here:
http://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks

The standard idiom people generally use is:
chunks = (L[i:i + N] for i in range(0, len(L), N))
This assumes the whole sequence fits in memory which it sounds like you are trying to avoid.

Alternatives use iter and zip and look like this:
chunks_zip = zip(*[iter(L)]*N)
The second version above drops a final group that is not full, but it can handle iterables that wouldn't otherwise fit in memory:
Python3
L = range(10**25)
N = 3

chunks_zip = zip(*[iter(L)]*N)

for i in range(10):
    print(next(chunks_zip))
Output:
(0, 1, 2)
(3, 4, 5)
(6, 7, 8)
(9, 10, 11)
(12, 13, 14)
(15, 16, 17)
(18, 19, 20)
(21, 22, 23)
(24, 25, 26)
(27, 28, 29)
Use zip_longest with a fill value if you don't want to lose incomplete groups.
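
For what it's worth, here is a minimal sketch of the zip_longest variant mentioned above. The sample list and the choice of None as the fill value are assumptions made for illustration; stripping the padding afterwards only works if None is never a real item in L.

```python
from itertools import zip_longest

L = list(range(8))
N = 3

# All N arguments are the same iterator, so each output tuple consumes N items.
chunks = list(zip_longest(*[iter(L)] * N, fillvalue=None))
print(chunks)  # [(0, 1, 2), (3, 4, 5), (6, 7, None)]

# Strip the padding again (assumes None is not a legitimate item in L).
trimmed = [tuple(x for x in c if x is not None) for c in chunks]
print(trimmed)  # [(0, 1, 2), (3, 4, 5), (6, 7)]
```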


RE: Code golfing: splitting a list - Ofnuts - Jan-12-2017

(Jan-12-2017, 01:26 AM)ichabod801 Wrote:

Assumes N divides evenly into len(L). If that's a concern, you could put a second index into the slice, or double zip it, to account for that.

This is more or less what my code does, but with slightly clumsier Python. And as far as I can tell this works even if N doesn't divide len(L).
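
The code being discussed here is not preserved in this printable version. As a rough sketch of the "1 every N" strided slicing that the thread refers to (not necessarily what either poster wrote), something like the following works whether or not N divides len(L), and never calls len(L). The name split_strided is made up for the example.

```python
def split_strided(L, N):
    """Return N sublists; sublist i takes every N-th item starting at index i."""
    return [L[i::N] for i in range(N)]

parts = split_strided(list(range(10)), 3)
print(parts)  # [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
```

Note that the sublists interleave the original order, which is fine here since the contents of each sublist don't matter.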

(Jan-12-2017, 02:41 AM)Mekire Wrote: So you want to chunk?
Lots of good answers here:
http://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks

The standard idiom people generally use is:
chunks = (L[i:i + N] for i in range(0, len(L), N))
This assumes the whole sequence fits in memory which it sounds like you are trying to avoid.
No, in the real-life problem I am handling rather small lists; it's just that each item ends up being used in a webservice call, so to speed things up I create N threads and give each one a part of the original list.
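
To make the use case concrete, here is a small sketch of handing each of N threads its own part of the list. Everything in it is an assumption for illustration: call_service is a hypothetical stand-in for the real webservice call, and the strided slice L[i::N] is just one way to form the parts.

```python
import threading

def call_service(item):
    # Hypothetical stand-in for the real webservice call.
    return item * 2

def worker(items, results, lock):
    # Process one part of the list and store each result under its item.
    for item in items:
        value = call_service(item)
        with lock:
            results[item] = value

L = list(range(10))
N = 3
results = {}
lock = threading.Lock()

# One thread per part; part i holds every N-th item starting at index i.
threads = [
    threading.Thread(target=worker, args=(L[i::N], results, lock))
    for i in range(N)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results.items()))
```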

(Jan-12-2017, 02:41 AM)Mekire Wrote: Alternatives use iter and zip and look like this:
chunks_zip = zip(*[iter(L)]*N)
The second version above drops a final group that is not full, but it can handle iterables that wouldn't otherwise fit in memory:
Python3
L = range(10**25)
N = 3

chunks_zip = zip(*[iter(L)]*N)

for i in range(10):
    print(next(chunks_zip))
Output:
(0, 1, 2)
(3, 4, 5)
(6, 7, 8)
(9, 10, 11)
(12, 13, 14)
(15, 16, 17)
(18, 19, 20)
(21, 22, 23)
(24, 25, 26)
(27, 28, 29)
Use zip_longest with a fill value if you don't want to lose incomplete groups.
Clever, but if you are unlucky, the last chunk produced by the idioms above can have a length of 1, so chunk sizes can differ by up to N-1. With the 1-in-every-N sampling you get more uniform sizes (but possibly at the cost of more CPU/memory, which isn't really a concern for me but could be for someone else).
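
A quick way to see the size difference, as a sketch: the helper names and the numbers (10 items split into 4 parts) are made up for the example. Note that in the slicing idiom the step is the chunk size, while in the strided version it is the number of parts.

```python
def contiguous_sizes(length, size):
    """Chunk lengths produced by L[i:i + size] for i in range(0, length, size)."""
    L = list(range(length))
    return [len(L[i:i + size]) for i in range(0, length, size)]

def strided_sizes(length, n):
    """Sublist lengths produced by L[i::n] for i in range(n)."""
    L = list(range(length))
    return [len(L[i::n]) for i in range(n)]

# 10 items into 4 parts: contiguous chunks of 3 leave a last chunk of 1.
print(contiguous_sizes(10, 3))  # [3, 3, 3, 1]
# Strided sampling spreads the remainder over the first sublists.
print(strided_sizes(10, 4))     # [3, 3, 2, 2]
```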


RE: Code golfing: splitting a list - wavic - Jan-12-2017

There is an example in the Python documentation: https://docs.python.org/3.5/library/itertools.html#itertools-recipes

from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    # n references to one iterator, so each tuple pulls n consecutive items
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)



RE: Code golfing: splitting a list - Ofnuts - Jan-12-2017

(Jan-12-2017, 09:13 AM)wavic Wrote: There is an example in the Python documentation: https://docs.python.org/3.5/library/itertools.html#itertools-recipes

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

Yes, this is Mekire's suggestion...


RE: Code golfing: splitting a list - ichabod801 - Jan-12-2017

(Jan-12-2017, 08:10 AM)Ofnuts Wrote: This is more or less what my code does, but with slightly clumsier Python. And as far as I can tell this works even if N doesn't divide len(L).

You said you wanted lists of equivalent length. I took that to mean equal length. If N does not divide evenly into len(L), then some of the lists will be one item longer. The fixes I proposed would make them all the same length, but they would drop some items.
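
The original code for these fixes is not preserved here, so the following is only a guess at what "a second index into the slice" and "double zip" might look like. Both versions give N sublists of identical length and drop the leftover items; the function names are invented for the example.

```python
def split_equal_slice(L, N):
    """N equal-length sublists via a stop index in the slice; leftovers dropped."""
    usable = (len(L) // N) * N  # items from this index onward are dropped
    return [L[i:usable:N] for i in range(N)]

def split_equal_zip(L, N):
    """Same result as tuples: group into full rows of N, then transpose."""
    return list(zip(*zip(*[iter(L)] * N)))

L = list(range(10))
print(split_equal_slice(L, 3))  # [[0, 3, 6], [1, 4, 7], [2, 5, 8]]
print(split_equal_zip(L, 3))    # [(0, 3, 6), (1, 4, 7), (2, 5, 8)]
```

In both cases item 9 is dropped, which is the trade-off described above.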


RE: Code golfing: splitting a list - wavic - Jan-12-2017

How about this

It looks simple enough for everyone.