I have a dataframe with two columns 'ProductID' and 'Balance'.
Some 'Balance' columns have values =0 (df['Balance']=0).
My dataframe contains different products.
Index ProductID Balance
1 10 100
2 20 0
3 30 200
4 20 150
5 20 240
6 40 100
7 30 200
I want to impute the zero balance at index 2 with average balance based on the product ID=20 ((0 +150 + 240)/3 =130) .
I am using the below code:
1) First getting the balance_average based on the ProductID.
balance_average = df.pivot_table(values='Balance', index='ProductID')
2) Identifying the zero balance rows
zero_bool = (df['Balance'] == 0)
3) Applying balance_average
df.loc[zero_bool, 'Balance'] = df.loc[zero_bool, 'ProductID'].apply(lambda x: balance_average[x])
when i execute the above code in python 3.6 I am getting KeyError: '20'.
20 is the corresponding ProductID of the first zero balance column.
Could someone tell me what's wrong with the above code?
Some 'Balance' columns have values =0 (df['Balance']=0).
My dataframe contains different products.
Index ProductID Balance
1 10 100
2 20 0
3 30 200
4 20 150
5 20 240
6 40 100
7 30 200
I want to impute the zero balance at index 2 with average balance based on the product ID=20 ((0 +150 + 240)/3 =130) .
I am using the below code:
1) First getting the balance_average based on the ProductID.
balance_average = df.pivot_table(values='Balance', index='ProductID')
2) Identifying the zero balance rows
zero_bool = (df['Balance'] == 0)
3) Applying balance_average
df.loc[zero_bool, 'Balance'] = df.loc[zero_bool, 'ProductID'].apply(lambda x: balance_average[x])
when i execute the above code in python 3.6 I am getting KeyError: '20'.
20 is the corresponding ProductID of the first zero balance column.
Could someone tell me what's wrong with the above code?