Weighted average with multiple weights and groups - Python Forum (https://python-forum.io), Data Science, /thread-21721.html
Weighted average with multiple weights and groups - amyd - Oct-11-2019

I am a beginner in Python and I am trying to improve my code, so I would appreciate some advice on how to improve the efficiency of the following. I have the following dataset:

    import numpy as np
    import pandas as pd

    petdata = {
        'animal': ['dog', 'cat', 'fish'],
        'male_1': [0.57, 0.72, 0.62],
        'female_1': [0.43, 0.28, 0.38],
        'age_01_1': [0.10, 0.16, 0.15],
        'age_15_1': [0.17, 0.29, 0.26],
        'age_510_1': [0.15, 0.19, 0.19],
        'age_1015_1': [0.18, 0.16, 0.17],
        'age_1520_1': [0.20, 0.11, 0.12],
        'age_20+_1': [0.20, 0.09, 0.10],
        'male_2': [0.57, 0.72, 0.62],
        'female_2': [0.43, 0.28, 0.38],
        'age_01_2': [0.10, 0.16, 0.15],
        'age_15_2': [0.17, 0.29, 0.26],
        'age_510_2': [0.15, 0.19, 0.19],
        'age_1015_2': [0.18, 0.16, 0.17],
        'age_1520_2': [0.20, 0.11, 0.12],
        'age_20+_2': [0.20, 0.09, 0.10],
        'weight_1': [10, 20, 30],
        'weight_2': [40, 50, 60],
    }
    df = pd.DataFrame(petdata)

I want to calculate a weighted average for the animals in my dataset using weight_1 for all the variables that end with "_1" and weight_2 for all the variables that end with "_2". I am doing it in this way at the moment:

    # Each result is a single number, broadcast to every row of the new column
    df['male_wav_1'] = np.nansum((df['male_1'] * df['weight_1']) / df['weight_1'].sum())
    df['female_wav_1'] = np.nansum((df['female_1'] * df['weight_1']) / df['weight_1'].sum())
    df['male_wav_2'] = np.nansum((df['male_2'] * df['weight_2']) / df['weight_2'].sum())
    df['female_wav_2'] = np.nansum((df['female_2'] * df['weight_2']) / df['weight_2'].sum())

And this is for every single column in my dataframe. I realise this is not very neat, so can anyone give me some advice on how to improve the process? I have tried to:
But I was unsuccessful with both. The issue is not in the reshaping, which I can do, but I am not clear on how to apply the different weights to the different groups I have in my data. Many thanks for any help.
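One possible way to apply a different weight to each group without writing one line per column is to loop over the weight columns and match value columns by their shared suffix. The sketch below is only an illustration, not a definitive answer: the helper name `weighted_averages`, its `weight_prefix` parameter, and the variables `wav` and `df_out` are made up for this example, and the dataset is abbreviated to the sex columns. It assumes every weight column is named `weight_<suffix>` and every value column for that group ends in `_<suffix>`.

```python
import pandas as pd

# Abbreviated version of the dataset from the post (age columns omitted)
petdata = {
    'animal': ['dog', 'cat', 'fish'],
    'male_1': [0.57, 0.72, 0.62],
    'female_1': [0.43, 0.28, 0.38],
    'male_2': [0.57, 0.72, 0.62],
    'female_2': [0.43, 0.28, 0.38],
    'weight_1': [10, 20, 30],
    'weight_2': [40, 50, 60],
}
df = pd.DataFrame(petdata)


def weighted_averages(df, weight_prefix='weight_'):
    """Hypothetical helper: one weighted average per value column.

    Assumes weight columns are named '<weight_prefix><suffix>' and that the
    value columns belonging to that weight end in '_<suffix>'.
    """
    results = {}
    weight_cols = [c for c in df.columns if c.startswith(weight_prefix)]
    for wcol in weight_cols:
        suffix = wcol[len(weight_prefix):]  # e.g. '1' or '2'
        weights = df[wcol]
        value_cols = [
            c for c in df.columns
            if c.endswith('_' + suffix) and c != wcol
        ]
        for col in value_cols:
            base = col.rsplit('_', 1)[0]  # 'male_1' -> 'male'
            # Series.sum() skips NaN by default, which mirrors the
            # np.nansum numerator in the original code.
            results[f'{base}_wav_{suffix}'] = (df[col] * weights).sum() / weights.sum()
    return pd.Series(results)


wav = weighted_averages(df)
print(wav)

# Optional: attach the scalars as constant columns, as in the original code
df_out = df.assign(**wav.to_dict())
```

Like the original code, this divides by the sum of all weights in the group even when a value is missing. If rows with missing values should also be excluded from the denominator, the weights would need to be masked per column first.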