dataframe groupby with totals for certain fields

**scidam** · Oct-19-2020, 12:21 AM

Your for-loop iterates over only two items (or a small number of columns), so I don't understand why it would be expensive. Probably the copying and concatenating of huge DataFrames is what costs the time.
An alternative is to calculate the aggregated (total) values as a separate df and concatenate it to the source df, e.g.

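import pandas as pd
# Per-region totals across all illnesses
gg = df.groupby('region').agg({'numbers': 'mean', 'respondent': pd.Series.nunique}).reset_index()
gg['illness'] = 'All'
# Append the totals rows to the source df (concat returns a new DataFrame)
result = pd.concat([df, gg], ignore_index=True)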

The same can be done grouping by 'illness' (setting 'region' to 'All'), and finally without groupby at all (i.e. df.agg(...)) to get the grand total row.
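Putting the three steps described above together might look like the sketch below. The sample data, the `aggs` dict, and the names `by_region`, `by_illness`, `grand`, and `result` are made up for illustration; only the column names come from the snippet in this post, and your real df may need different aggregations.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the post.
df = pd.DataFrame({
    "region": ["North", "North", "South", "South"],
    "illness": ["flu", "cold", "flu", "flu"],
    "respondent": [1, 2, 3, 3],
    "numbers": [10.0, 20.0, 30.0, 50.0],
})

# One aggregation spec reused for every level of totals.
aggs = {"numbers": "mean", "respondent": "nunique"}

# Per-region totals across all illnesses.
by_region = df.groupby("region").agg(aggs).reset_index()
by_region["illness"] = "All"

# Per-illness totals across all regions.
by_illness = df.groupby("illness").agg(aggs).reset_index()
by_illness["region"] = "All"

# Grand total without groupby: df.agg with a dict returns a Series,
# so convert it into a one-row DataFrame before labelling it.
grand = df.agg(aggs).to_frame().T
grand["region"] = "All"
grand["illness"] = "All"

# A single concat at the end avoids repeatedly copying the large df.
result = pd.concat([df, by_region, by_illness, grand], ignore_index=True)
print(result)
```

Doing one `pd.concat` at the end, rather than appending inside a loop, keeps the number of full copies of the source df to one.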
