Python Forum
Pandas - compute means per category and time
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Pandas - compute means per category and time
#6
(Nov-12-2020, 04:49 PM)jefsummers Wrote: Not clear. Why is the 4th item, which is round 1, (5+1)/2? You have 2 round 1 entries by this time. And the 5th would be (4+2)/2? Still not getting how you arrive at your values.
And agree, groupby or other aggregating functions might work, but not by the formulas I am seeing...

OK, so once again, sorry :) Look at this df:
df = pd.DataFrame(data={ 'name':["a","a","a","b","b","c" ] , 'value':[5,4,3,1,2,1] , 'round':[1,2,3,1,2,1 ]})
In the first round I don`t compute any mean, so 'mean_per_round':[5, , , 1, , 1 ]. In the second round, I compute mean of the value from the second and first round. So 'mean_per_round':[,4.5 , , ,1.5 , ]. Similarly, in the third round I compute the average of the value from the first, second and third round. So 'mean_per_round':[, ,4 , , , ]. I work with unbalanced dataset, so I have no values of "b" in the third round and no values of "c" for the second and third round.
Putting it together, new df will look like this:

new_df = pd.DataFrame(data={ 'name':["a","a","a","b","b","c" ] , 'value':[5,4,3,1,2,1] , 'round':[1,2,3,1,2,1 ] ,  'mean_per_round':[5, 4.5 , 4, 1, 1.5, 1 ]  })
Is it clear now?

I tried groupby, but this gives me something different:
df.groupby(['name', 'round']).mean()
Reply


Messages In This Thread
RE: Pandas - compute means per category and time - by rama27 - Nov-12-2020, 07:32 PM

Possibly Related Threads…
Thread Author Replies Views Last Post
  Parsing and summing time deltas (duration) onto bar + pie charts using pandas - - DRY Drone4four 2 604 Feb-10-2024, 06:04 PM
Last Post: Drone4four
  Pandas read csv file in 'date/time' chunks MorganSamage 4 1,722 Feb-13-2023, 11:24 AM
Last Post: MorganSamage
  K Means Clustering antouanet 0 692 Jan-30-2023, 01:18 PM
Last Post: antouanet
  pandas: Compute the % of the unique values in a column JaneTan 1 1,791 Oct-25-2021, 07:55 PM
Last Post: jefsummers
  Kaggle Titanic - new category placement snakes 0 1,699 Oct-18-2021, 07:53 PM
Last Post: snakes
  How does K-Means++ work in selecting initial centroids? JaneTan 1 1,918 Jul-30-2021, 08:56 AM
Last Post: Gribouillis
  Does a pandas have a date without a time? AlekseyPython 6 4,966 Feb-10-2021, 09:24 AM
Last Post: Naheed
  Using shift to compute the percent change in a time series new_to_python 6 4,198 Mar-03-2020, 07:50 PM
Last Post: new_to_python
  create 10 yearly blocks from time series using pandas Staph 1 1,977 Jul-23-2019, 12:01 PM
Last Post: Malt
  Simple String to Time within a pandas dataframe Ecniv 1 2,526 Jun-14-2019, 03:25 AM
Last Post: scidam

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020