Python Forum
Question about the groupby function
#1
Hi, I have the following code:

import itertools
first_letter = lambda x: x[0]
names = ['Alan', 'Adam', 'Wes', 'Will', 'Albert', 'Steven']

for letter, names in itertools.groupby(names, first_letter):
   print(letter, list(names))
The program returned:

A ['Alan', 'Adam']
W ['Wes', 'Will']
A ['Albert']
S ['Steven']

Does anybody know why it does not return a result like the following?

A ['Alan', 'Adam', 'Albert']
W ['Wes', 'Will']
S ['Steven']
#2
list needs to be sorted
use:
>>> names = ['Alan', 'Adam', 'Wes', 'Will', 'Albert', 'Steven']
>>> names.sort()
>>> for letter, inames in itertools.groupby(names, first_letter):
...     print(letter, list(inames))
... 
A ['Adam', 'Alan', 'Albert']
S ['Steven']
W ['Wes', 'Will']
>>>
also, you were overwriting names
result:
#3
(Feb-08-2020, 04:39 AM)new_to_python Wrote: Does anybody know why it does not return a result like the following?

A ['Alan', 'Adam', 'Albert']
W ['Wes', 'Will']
S ['Steven']

In the Python interactive interpreter, type help(itertools.groupby) and note the part that says "returns consecutive keys and groups from the iterable".


>>> help(itertools.groupby)
class groupby(builtins.object)
 |  groupby(iterable, key=None)
 |  
 |  make an iterator that returns consecutive keys and groups from the iterable
 |  
 |  iterable
 |    Elements to divide into groups according to the key function.
 |  key
 |    A function for computing the group category for each element.
 |    If the key function is not specified or is None, the element itself
 |    is used for grouping.
/.../
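A minimal sketch of what "consecutive" means in that docstring, reusing the names list from the question (the `first_letter` function and the `keys` variable are just illustrative names): groupby starts a new group every time the key changes, so the same key can show up more than once if the input is not sorted by it.

```python
import itertools

def first_letter(name):
    # Key function: group by the first character of each name.
    return name[0]

names = ['Alan', 'Adam', 'Wes', 'Will', 'Albert', 'Steven']

# Collect only the keys, in the order groupby produces them.
# 'A' appears twice because 'Albert' does not directly follow 'Adam'.
keys = [letter for letter, _group in itertools.groupby(names, first_letter)]
print(keys)  # ['A', 'W', 'A', 'S']
```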
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy

Da Bishop: There's a dead bishop on the landing. I don't know who keeps bringing them in here. ....but society is to blame.
#4
Thanks. Yes, I read that. Is the key word here "consecutive"? Am I correct that in this example, because 'Albert' is separated from ['Alan', 'Adam'] by ['Wes', 'Will'], it does not consecutively follow ['Alan', 'Adam'] and therefore gets its own group?

(Feb-08-2020, 06:06 AM)Larz60+ Wrote: list needs to be sorted
use:
>>> names = ['Alan', 'Adam', 'Wes', 'Will', 'Albert', 'Steven']
>>> names.sort()
>>> for letter, inames in itertools.groupby(names, first_letter):
...     print(letter, list(inames))
... 
A ['Adam', 'Alan', 'Albert']
S ['Steven']
W ['Wes', 'Will']
>>>
also, you were overwriting names
result:

Thanks. Is it always better to use a loop variable name different from the first argument passed to groupby, to avoid overwriting it?

What follows after "result:"?
#5
reusing names for different objects is dangerous, but allowed.
It's like you had 10 kids all named Pete, some boys and some girls!
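A small sketch of what the name reuse does in the original loop, assuming the same names list as in the question (the `original` variable is added here only so the list stays reachable): the call to groupby still receives the list, but after the loop the name `names` refers to the last group iterator instead.

```python
import itertools

names = ['Alan', 'Adam', 'Wes', 'Will', 'Albert', 'Steven']
original = names  # second reference, so the list can still be inspected later

# groupby(names, ...) is evaluated once, with the list.
# The loop variable `names` is then rebound on every iteration.
for letter, names in itertools.groupby(names, lambda x: x[0]):
    pass

# After the loop, `names` is the last group iterator, not the list.
print(names is original)        # False
print(isinstance(names, list))  # False
print(original)                 # the list itself is unchanged
```

The loop still works because the list is passed to groupby before the name is rebound, but any later code that expects `names` to be the list would break.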
#6
(Feb-08-2020, 10:50 PM)Larz60+ Wrote: reusing names for different objects is dangerous, but allowed.
It's like you had 10 kids all named Pete, some boys and some girls!

I will keep that in mind. Thanks.

So am I correct that because 'Albert' and ['Alan', 'Adam'] are separated by ['Wes', 'Will'], 'Albert' is not consecutive to ['Alan', 'Adam'] and as a result, it forms its own group?
#7
Yes, that's why I added the sort.
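As a hedged sketch, here are two ways to get one group per letter, assuming the names list from the question. Sorting with the same key that groupby uses keeps the original order inside each group, because sorted() is stable. Collecting into a dict is not what the posts above used, but it avoids sorting and keeps the letters in first-seen order (the `buckets`, `by_sorting` and `by_dict` names are only illustrative).

```python
import itertools
from collections import defaultdict

names = ['Alan', 'Adam', 'Wes', 'Will', 'Albert', 'Steven']

def first_letter(name):
    # Key function shared by sorted() and groupby().
    return name[0]

# Option 1: sort by the grouping key, then group.
by_sorting = [
    (letter, list(group))
    for letter, group in itertools.groupby(sorted(names, key=first_letter), first_letter)
]
print(by_sorting)
# [('A', ['Alan', 'Adam', 'Albert']), ('S', ['Steven']), ('W', ['Wes', 'Will'])]

# Option 2: no sorting; a dict collects each name under its letter.
buckets = defaultdict(list)
for name in names:
    buckets[first_letter(name)].append(name)
by_dict = list(buckets.items())
print(by_dict)
# [('A', ['Alan', 'Adam', 'Albert']), ('W', ['Wes', 'Will']), ('S', ['Steven'])]
```

The second form matches the output asked for in the first post, since dicts preserve insertion order in Python 3.7 and later.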
#8
(Feb-09-2020, 04:14 AM)new_to_python Wrote: So am I correct that because 'Albert' and ['Alan', 'Adam'] are separated by ['Wes', 'Will'], 'Albert' is not consecutive to ['Alan', 'Adam'] and as a result, it forms its own group?

The Python interactive interpreter is an excellent tool for observing how things work:

>>> import itertools
>>> s = 'aabbc'
>>> itertools.groupby(s)
<itertools.groupby at 0x1187f9db0>              # groupby object, not very helpful
>>> list(itertools.groupby(s))                  # let's peek inside
[('a', <itertools._grouper at 0x118816ac8>),    # groupby object is a stream of tuples where:
 ('b', <itertools._grouper at 0x1188168d0>),    #   - first element is the group key
 ('c', <itertools._grouper at 0x118816940>)]    #   - second element is the group itself, as a grouper object
>>> for key, group in itertools.groupby(s):     # let's unpack it into a human-readable format
...     print(f'Group name: {key}, group: {[*group]}')
...
Group name: a, group: ['a', 'a']
Group name: b, group: ['b', 'b']
Group name: c, group: ['c']
This is the basic functionality, which might not seem very helpful. But groupby also supports a key function, which makes it possible to do a lot of interesting things. Some examples below.

Extract numbers from user input/string:

>>> user_input = ' a34+ *2'
>>> for key, group in itertools.groupby(user_input, lambda char: char.isdigit()):  # key is a bool: True for runs of digits
...     if key:                                                                    # keep only the digit runs
...         print(int(''.join(group)))                                             # join the digit characters and convert to int
...
34
2
# as a list comprehension one-liner:
>>> [int(''.join(group)) for key, group in itertools.groupby(user_input, lambda char: char.isdigit()) if key]
[34, 2]
Split on many splitters:

>>> text = 'abcdefghijklm'
>>> splitters = ['b', 'f', 'j']             # split text on these splitters
>>> list(''.join(group) for key, group in itertools.groupby(text, lambda char: char not in splitters) if key)
['a', 'cde', 'ghi', 'klm']
By combining groupby with other itertools functions, even more 'interesting' code can be written.
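One possible illustration of such a combination, as a sketch rather than anything from the posts above: run-length encoding with groupby, and decoding with itertools.chain.from_iterable and itertools.repeat. The function names `rle_encode` and `rle_decode` are made up for this example.

```python
import itertools

def rle_encode(text):
    # Each run of equal characters becomes a (character, run length) pair.
    return [(char, sum(1 for _ in group)) for char, group in itertools.groupby(text)]

def rle_decode(pairs):
    # repeat() expands each pair, chain.from_iterable() flattens the runs.
    return ''.join(
        itertools.chain.from_iterable(itertools.repeat(char, count) for char, count in pairs)
    )

encoded = rle_encode('aabbc')
print(encoded)              # [('a', 2), ('b', 2), ('c', 1)]
print(rle_decode(encoded))  # aabbc
```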