Python Forum
groups attribute of a groupby object question
#1
I am having trouble understanding the following code. Note, auto is a DataFrame that contains the variables yr (year) and mpg (miles per gallon):

splitting = auto.groupby('yr')
type(splitting.groups)
for group_name, group in splitting:
    avg = group['mpg'].mean()
    print(group_name, avg)
Here is what I understand: we are saving a groupby object, grouped by year, to "splitting". Next, we see that the type of splitting.groups is a dictionary. We iterate over the key-value pairs in splitting, compute an average, and print each key along with its average mpg.

What is confusing me is the line "avg = group['mpg'].mean()". From what I've learned about indexing, I've never seen a dictionary value being indexed like this. Since group is the value in the key-value pairs, how does Python know that group['mpg'] refers to the mpg column?
#2
Your group_name and group are not a key: value pair from splitting.groups. Iterating over a DataFrame's groupby object yields each unique value of the grouping variable together with the matching subset of the original DataFrame. In your case, group_name is a yr value and group is a subset of the original auto DataFrame, basically auto[auto.yr == group_name]. So group['mpg'] is the column of mpg values for that year (again, it's basically part of the auto['mpg'] column).
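To make the difference concrete, here is a minimal sketch. The small auto DataFrame below is a made-up stand-in (the real data is not shown in this thread), and groups_as_lists is just an illustrative name. It suggests that .groups maps each key to row index labels, while iteration hands back (key, sub-DataFrame) pairs:

```python
import pandas as pd

# Hypothetical stand-in for the auto DataFrame (values are made up)
auto = pd.DataFrame({
    "yr": [70, 70, 71, 71, 71],
    "mpg": [18.0, 15.0, 25.0, 24.0, 26.0],
})

splitting = auto.groupby("yr")

# .groups maps each group key to the index labels of the rows in that group
groups_as_lists = {key: list(labels) for key, labels in splitting.groups.items()}
print(groups_as_lists)

# Iterating the groupby object yields (key, sub-DataFrame) pairs instead
for group_name, group in splitting:
    # Each group should hold the same rows as a boolean-mask selection
    same_rows = group.equals(auto[auto.yr == group_name])
    print(group_name, type(group).__name__, same_rows)
```

Because group is itself a DataFrame, group['mpg'] is ordinary column selection, not dictionary indexing.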

You could get a similar result without groupby:
for yr in auto.yr.unique():  # yr is like your group_name
    group = auto[auto.yr==yr]
    avg = group['mpg'].mean()
    print(yr, avg)
But usually you would just use auto.groupby('yr')['mpg'].mean(), without any explicit looping.
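As a rough check that the one-liner and the explicit loop agree, here is a short sketch on made-up data (the auto values and the names loop_means and direct_means are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical stand-in for the auto DataFrame (values are made up)
auto = pd.DataFrame({
    "yr": [70, 70, 71, 71, 71],
    "mpg": [18.0, 15.0, 25.0, 24.0, 26.0],
})

# Explicit loop: one mean per group, collected in a plain dict
loop_means = {}
for group_name, group in auto.groupby("yr"):
    loop_means[group_name] = group["mpg"].mean()

# Same computation without a loop; the result is a Series indexed by yr
direct_means = auto.groupby("yr")["mpg"].mean()
print(direct_means)

# Both approaches should give the same per-year averages
print(direct_means.to_dict() == loop_means)
```

The loop-free form returns a Series indexed by yr, which is usually easier to plot or merge back than printed values.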
#3
(Apr-26-2017, 08:33 PM)zivoni Wrote: Your group_name and group are not a key: value pair from splitting.groups. Iterating over a DataFrame's groupby object yields each unique value of the grouping variable together with the matching subset of the original DataFrame. In your case, group_name is a yr value and group is a subset of the original auto DataFrame, basically auto[auto.yr == group_name]. So group['mpg'] is the column of mpg values for that year (again, it's basically part of the auto['mpg'] column).

That makes a lot of sense. It took me a few read-throughs to understand, but I've got it all now lol. Thanks for the thorough explanation.

