Python Forum

Full Version: Using groupby on non-categorical values
Hello everybody, I have a data set:
import pandas as pd

animals = pd.DataFrame({'kind': ['cat', 'dog', 'cat', 'dog'],
                        'height': [9.1, 6.0, 9.5, 34.0],
                        'weight': [7.9, 7.5, 9.9, 198.0]})
Output:
  kind  height  weight
0  cat     9.1     7.9
1  dog     6.0     7.5
2  cat     9.5     9.9
3  dog    34.0   198.0
It is simple enough to use groupby to get, say, the statistics of height and weight by kind:

animals.groupby("kind").agg(
    min_height=('height', 'min'),
    max_height=('height', 'max'),
    average_weight=('weight', 'mean'),
)
Output:
      min_height  max_height  average_weight
kind
cat          9.1         9.5            8.90
dog          6.0        34.0          102.75
My challenge: what if I want the mean weight of all animals that have a height between 9.0 and 10.0? Can I still use groupby? Thanks!
You can filter the entire dataframe first, e.g.:

animals[(9 < animals.height) & (animals.height < 10)].groupby("kind").agg(
    min_height=('height', 'min'),
    max_height=('height', 'max'),
    average_weight=('weight', 'mean'),
)
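
If the goal is the mean weight across all animals in a height range (not per kind), groupby may not be needed at all. And if you do want to group on a continuous column, one common approach is to bin it first. Here is a rough sketch of both, assuming a reasonably recent pandas. The bin edges, the labels, and the variable names (in_range, height_bins, and so on) are my own illustrative choices, not something from the question:

```python
import pandas as pd

animals = pd.DataFrame({'kind': ['cat', 'dog', 'cat', 'dog'],
                        'height': [9.1, 6.0, 9.5, 34.0],
                        'weight': [7.9, 7.5, 9.9, 198.0]})

# Option 1: a single range needs no groupby; mask the rows, then average.
# inclusive='neither' mirrors the strict 9 < height < 10 filter above.
in_range = animals['height'].between(9.0, 10.0, inclusive='neither')
mean_weight_in_range = animals.loc[in_range, 'weight'].mean()  # about 8.9 here

# Option 2: bin the continuous column, then group by the bins.
# Edges and labels are assumptions chosen for illustration only.
height_bins = pd.cut(animals['height'],
                     bins=[0, 9, 10, float('inf')],
                     labels=['under 9', '9 to 10', 'over 10'])
mean_weight_by_bin = animals.groupby(height_bins, observed=True)['weight'].mean()

print(mean_weight_in_range)
print(mean_weight_by_bin)
```

Note that pd.cut uses right-closed intervals by default, so the '9 to 10' bin here is (9, 10] and would include a height of exactly 10.0, unlike the strict filter. Pass right=False if you want the intervals closed on the left instead.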