Aug-23-2018, 02:51 PM
As the title suggests, I am trying to randomly remove rows from a pandas dataframe, based on whether the given row (instance) has a certain value in a given column.
For example, suppose I had the following dataframe:
see attachment if image doesn't show*

and I wanted to randomly remove 33% of the rows that have a value of 0 in the 'balon_dor_winner' column. How would I go about doing it?
I have tried the following but it hasn't worked:
df.drop(df.loc[df['balon_dor_winner']==0].sample(frac=0.33).index)
which didn't work, and also:
df.drop(df.query('balon_dor_winner == 0').sample(frac=.33).index)
but no luck so far.
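
One thing worth checking, as an assumption since the post doesn't say how the attempts failed: df.drop returns a new dataframe and leaves df itself unchanged unless the result is assigned back (or inplace=True is passed). Below is a minimal sketch of that idea. The sample data and the 'player' column are hypothetical stand-ins for the dataframe in the attachment, and random_state is only there to make the run repeatable.

```python
import pandas as pd

# Hypothetical stand-in data; the real dataframe is only shown in the attachment.
df = pd.DataFrame({
    "player": [f"p{i}" for i in range(12)],
    "balon_dor_winner": [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0],
})

# Rows eligible for removal: those with 0 in the column.
zeros = df[df["balon_dor_winner"] == 0]

# Pick roughly 33% of those rows at random and keep their index labels.
to_drop = zeros.sample(frac=0.33, random_state=0).index

# drop() returns a new DataFrame, so the result has to be kept.
df_reduced = df.drop(to_drop)

print(len(df), len(df_reduced))
```

This relies on the index labels being unique. If the index contains duplicates, drop removes every row sharing a selected label, so calling df.reset_index(drop=True) first may be needed.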