Thread: Dataframe mean calculation problem: do we have to loop? (https://python-forum.io/thread-29308.html), Python Forum, General Coding Help
Dataframe mean calculation problem: do we have to loop? - sparkt - Aug-27-2020

Suppose we have a very simple dataframe.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 5, 6, 7], 'B': [20, 30, 50, 90, 80]})
print(df)
```

```
   A   B
0  1  20
1  2  30
2  5  50
3  6  90
4  7  80
```

The question is simple: how do we create a third column 'C' such that the following is true?

- df.C[0] = mean of all 10 numbers
- df.C[1] = mean of 2, 5, 6, 7, 30, 50, 90, 80
- df.C[2] = mean of 5, 6, 7, 50, 90, 80
- df.C[3] = mean of 6, 7, 90, 80
- df.C[4] = mean of 7, 80

I've read dozens of relevant tutorials online, but all of them only show how to get a single mean for a single row or column. Any help would be much appreciated.

RE: Dataframe mean calculation problem - sparkt - Aug-28-2020

I got it, though I expected something much simpler.

```python
# df is the frame defined in the first post.
df['C'] = 0.0
for i in df.index:
    # Mean of every value in columns A and B from row i to the end.
    # .loc is used instead of the chained assignment df['C'][i] = ...,
    # which may write to a temporary copy and raises SettingWithCopyWarning.
    df.loc[i, 'C'] = df.loc[i:, ['A', 'B']].to_numpy().mean()
print(df)
```

But do we have to use this for loop to go through all the rows? One of the best things about a dataframe is that a function can be applied to the whole frame at once, without an explicit Python loop, which is usually slower. I hope there's a more elegant solution!
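
One loop-free way to get the same column, offered as a sketch rather than an answer from the thread: the mean of "everything from row i down" is the sum of those values divided by how many there are, and both can be built with a cumulative sum taken from the bottom of the frame upward. The names `cols`, `row_sums`, `tail_sums`, `row_counts` and `tail_counts` are made up for this illustration, and the sketch assumes that "from row i to the end" means row position in the frame as displayed.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 5, 6, 7], 'B': [20, 30, 50, 90, 80]})

cols = ['A', 'B']  # columns that feed the mean (assumption: only these two)

# Per-row sum of A and B, then accumulated from the last row upward,
# so tail_sums[i] is the sum of all values in rows i..end.
row_sums = df[cols].sum(axis=1)
tail_sums = row_sums[::-1].cumsum()[::-1]

# Per-row count of non-missing values, accumulated the same way,
# so tail_counts[i] is how many values sit in rows i..end.
row_counts = df[cols].count(axis=1)
tail_counts = row_counts[::-1].cumsum()[::-1]

# Element-wise division gives the mean of rows i..end for every i at once.
df['C'] = tail_sums / tail_counts
print(df)
```

For the five-row example this gives 29.1 for row 0 (291 / 10) and 43.5 for row 4 (87 / 2). Using `count` instead of a fixed 2 per row keeps the denominator right if some cells are NaN, since `sum` skips them as well.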