Python Forum

Hello everybody, I have a data set:

import pandas as pd

animals = pd.DataFrame({'kind': ['cat', 'dog', 'cat', 'dog'],
                        'height': [9.1, 6.0, 9.5, 34.0],
                        'weight': [7.9, 7.5, 9.9, 198.0]})

Output:
  kind  height  weight
0  cat     9.1     7.9
1  dog     6.0     7.5
2  cat     9.5     9.9
3  dog    34.0   198.0

It is simple enough to use groupby to get, say, the statistics of height and weight by kind:

animals.groupby("kind").agg(
    min_height=('height', 'min'),
    max_height=('height', 'max'),
    average_weight=('weight', 'mean'),
)

Output:
      min_height  max_height  average_weight
kind
cat          9.1         9.5            8.90
dog          6.0        34.0          102.75

My challenge: what if I want the mean weight of all animals that have a height between 9.0 and 10.0? Can I still use groupby? Thanks!

You can filter the entire DataFrame first and then group the remaining rows, e.g.:

animals[(9 < animals.height) & (animals.height < 10)].groupby("kind").agg(
    min_height=('height', 'min'),
    max_height=('height', 'max'),
    average_weight=('weight', 'mean'),
)
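
Note that filtering first drops any kind with no rows in the height range (here, dog). If you want a single overall mean, or want every kind to stay in the result, something like the following may help. It is only a sketch: the names mask, overall_mean and per_kind are made up for illustration, and it assumes the strict bounds 9 < height < 10 used above.

```python
import pandas as pd

animals = pd.DataFrame({'kind': ['cat', 'dog', 'cat', 'dog'],
                        'height': [9.1, 6.0, 9.5, 34.0],
                        'weight': [7.9, 7.5, 9.9, 198.0]})

# Boolean mask: True for rows whose height lies strictly between 9 and 10.
mask = (animals['height'] > 9) & (animals['height'] < 10)

# Mean weight over all animals in the height range; no groupby needed.
overall_mean = animals.loc[mask, 'weight'].mean()

# Per-kind mean that keeps every kind in the result: weights outside the
# range are replaced by NaN, and mean() skips NaN within each group.
per_kind = animals['weight'].where(mask).groupby(animals['kind']).mean()

print(overall_mean)
print(per_kind)
```

With the sample data, both cats fall inside the range, so the overall mean and the cat mean come out at roughly 8.9, while dog shows NaN because neither dog is between 9 and 10 in height.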

namy77

scidam