Python Forum
groups attribute of a groupby object question
#1
I am having trouble understanding the following code. Note, auto is a DataFrame that contains the variables yr (year) and mpg (miles per gallon):

splitting = auto.groupby('yr')
type(splitting.groups)
for group_name, group in splitting:
    # group is the sub-DataFrame of rows whose yr equals group_name
    avg = group['mpg'].mean()
    print(group_name, avg)
Here is what I understand: we are saving a groupby object to "splitting" that is grouped by year. Next, we see that the type of splitting.groups is a dictionary. We iterate over the key-value pairs in splitting, obtain an average, and print the key along with its average mpg.

What is confusing me is the line "avg=group['mpg'].mean()." From what I've learned about indexing, I've never seen a value being able to be used at the beginning of the index. Since group is the value in the key value pairs, how does Python know that group['mpg'] refers to the mpg column?
#2
Your group_name and group are not a key: value pair from splitting.groups. Iterating over a DataFrame's groupby object yields each unique value of the grouping variable together with the corresponding subset of the original DataFrame. In your case, group_name is a yr value and group is a subset of the original auto DataFrame, basically auto[auto.yr==group_name]. So group['mpg'] is the column of mpg values for that subset (again, it's basically part of the auto['mpg'] column).
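
To make the difference concrete, here is a minimal sketch. The small auto DataFrame below is a hypothetical stand-in (the real data from the question is not shown), and groups_index is just an illustrative name. It compares what splitting.groups holds with what iterating over splitting yields:

```python
import pandas as pd

# Hypothetical stand-in for the auto DataFrame from the question.
auto = pd.DataFrame({
    "yr": [70, 70, 71, 71, 71],
    "mpg": [18.0, 15.0, 25.0, 27.0, 23.0],
})

splitting = auto.groupby("yr")

# .groups maps each group key to the index labels of its rows,
# not to a DataFrame.
groups_index = {key: list(labels) for key, labels in splitting.groups.items()}
print(groups_index)

# Iterating over the groupby object yields (key, sub-DataFrame) pairs.
for group_name, group in splitting:
    same_rows = group.equals(auto[auto["yr"] == group_name])
    print(group_name, type(group).__name__, same_rows)
```

Since group is a DataFrame in each iteration, group['mpg'] is ordinary column selection, the same as auto['mpg'] on the full table.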

You could get a similar result without groupby with:
for yr in auto.yr.unique():  # yr is like your group_name
    group = auto[auto.yr==yr]
    avg = group['mpg'].mean()
    print(yr, avg)
But usually you would just use auto.groupby('yr')['mpg'].mean(), without any explicit looping.
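
As a rough check that the one-liner and the loop agree, here is a short sketch on made-up data (the auto values are an assumption, as are the names mean_by_year and loop_means):

```python
import pandas as pd

# Hypothetical sample data standing in for the real auto DataFrame.
auto = pd.DataFrame({
    "yr": [70, 70, 71, 71, 71],
    "mpg": [18.0, 15.0, 25.0, 27.0, 23.0],
})

# Vectorized version: returns a Series of mean mpg indexed by yr.
mean_by_year = auto.groupby("yr")["mpg"].mean()

# Explicit loop version, kept only for comparison.
loop_means = {name: group["mpg"].mean() for name, group in auto.groupby("yr")}

print(mean_by_year)
print(loop_means)
```

The result of the vectorized call is a Series, so a single year can be looked up with mean_by_year.loc[70].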
#3
(Apr-26-2017, 08:33 PM)zivoni Wrote: Your group_name and group are not a key: value pair from splitting.groups. Iterating over a DataFrame's groupby object yields each unique value of the grouping variable together with the corresponding subset of the original DataFrame. In your case, group_name is a yr value and group is a subset of the original auto DataFrame, basically auto[auto.yr==group_name]. So group['mpg'] is the column of mpg values for that subset (again, it's basically part of the auto['mpg'] column).

That makes a lot of sense. Took me a few times to read over to understand but I got it all now lol. Thanks for the thorough explanation.