Python Forum
Removing rows at random based on the value of a specific column
#1
As the title suggests, I am trying to randomly remove rows from a pandas DataFrame, restricted to the rows (instances) that have a certain value in a given column.

For example, suppose I had the following dataframe:

[Image: the example dataframe, attached to the original post]


and I wanted to remove at random 33% of the rows which have a value of 0 in the 'balon_dor_winner' column, how would I go about doing it?

I have tried the following, but neither attempt worked:

df.drop(df.loc[df['balon_dor_winner']==0].sample(frac=0.33).index)

and also:

df.drop(df.query('balon_dor_winner == 0').sample(frac=.33).index)

#2
What about adding another column of random numbers, then dropping the rows where the winner column is 0 and the new column is < 0.33?
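One way to read this suggestion is sketched below. The small dataframe, the `rand` column name, and the seed are made up for illustration; only `balon_dor_winner` comes from the question. Note that this drops each 0-row independently with probability 0.33, so the share actually removed will only be roughly 33%, not exactly.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the dataframe in the question.
df = pd.DataFrame({
    "player": [f"p{i}" for i in range(12)],
    "balon_dor_winner": [0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1],
})

rng = np.random.default_rng(seed=0)  # seeded only so the sketch is repeatable
df["rand"] = rng.random(len(df))     # uniform draws in [0, 1)

# Drop rows where the winner flag is 0 AND the random draw falls below 0.33.
to_drop = (df["balon_dor_winner"] == 0) & (df["rand"] < 0.33)
result = df[~to_drop].drop(columns="rand")  # remove the helper column again
```

Rows with `balon_dor_winner == 1` are never dropped, because the first condition is false for them.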
Craig "Ichabod" O'Brien - xenomind.com
I wish you happiness.
Recommended Tutorials: BBCode, functions, classes, text adventures
#3
Yeah, that sounds okay, but I would have really liked to do it as described. I'm sure it could be done in R; it's just that I want to start using pandas more.
#4
Well, instead of adding a column, you could build a boolean list (a mask) that combines both conditions and subset the dataframe with that.
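A minimal sketch of the mask idea, under the same assumptions as before: the sample data and seed are hypothetical, and the random draws live in a temporary array instead of a dataframe column.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the dataframe in the question.
df = pd.DataFrame({
    "player": [f"p{i}" for i in range(12)],
    "balon_dor_winner": [0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1],
})

rng = np.random.default_rng(seed=1)  # seeded only so the sketch is repeatable

is_zero = (df["balon_dor_winner"] == 0).to_numpy()  # condition 1: flag is 0
unlucky = rng.random(len(df)) < 0.33                # condition 2: random draw below 0.33

keep = ~(is_zero & unlucky)  # keep every row that does not meet both conditions
result = df[keep]            # boolean indexing; df itself is left untouched
```

As with the extra-column version, the fraction removed is 33% only on average.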
Craig "Ichabod" O'Brien - xenomind.com
#5
Got it. My mistake was not assigning the result back to a variable, like so:
df=df.drop(df.query('salary == 0').sample(frac=.41).index)
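For anyone landing here later, a small sketch of why the assignment matters: `DataFrame.drop` returns a new dataframe by default and leaves the original alone. The sample data and `random_state` are made up for illustration, and the column and fraction follow the original question instead of the `salary` line above. It also assumes the index labels are unique, since `drop` removes rows by label.

```python
import pandas as pd

# Hypothetical stand-in for the dataframe in the question.
df = pd.DataFrame({
    "player": [f"p{i}" for i in range(12)],
    "balon_dor_winner": [0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1],
})
n_before = len(df)

# Index labels of a random 33% of the rows whose flag is 0.
# random_state is fixed only so the sketch is repeatable.
drop_idx = df.query("balon_dor_winner == 0").sample(frac=0.33, random_state=0).index

df.drop(drop_idx)                # result is discarded; df is unchanged
unchanged = len(df) == n_before  # True

df = df.drop(drop_idx)           # reassigning keeps the trimmed dataframe
```

Unlike the random-number approaches, `sample(frac=...)` picks a fixed number of rows, so the count removed is the requested fraction of the 0-rows, rounded to a whole number.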