Posts: 21
Threads: 10
Joined: May 2021
Hi,
I have a dataframe with two columns: an 'ID' column and a 'V1' column whose values range from 1 to 3.
I'd like to mark every row of an ID with 1 if the value 3 appears anywhere in that ID's group, and 0 otherwise.
The attachment makes this clearer; I'd like to obtain the 'Result' column shown there.
Thank you in advance!
Posts: 6,813
Threads: 20
Joined: Feb 2020
May-13-2022, 03:44 PM
(This post was last modified: May-13-2022, 03:45 PM by deanhystad.)
Posts: 21
Threads: 10
Joined: May 2021
I haven't tried much...
I'd like to combine df.groupby('ID') and np.where(df['V1']==3, 1, 0), but I don't think that's possible...
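For what it's worth, the two pieces can be combined. Here is a minimal sketch, assuming a small made-up frame with the 'ID' and 'V1' columns described above (the values are invented for illustration, not taken from the attachment). It relies on the stated fact that 'V1' never exceeds 3, so a group contains a 3 exactly when its maximum is 3.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real values are in the attachment.
df = pd.DataFrame({
    "ID": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB"],
    "V1": [1, 2, 3, 1, 2, 2],
})

# transform("max") computes the maximum per ID and repeats it on every
# row of that ID, so the result lines up with the original frame.
group_max = df.groupby("ID")["V1"].transform("max")

# Assumption from the question: V1 is between 1 and 3, so max == 3
# means "a 3 is present somewhere in this group".
df["Result"] = np.where(group_max == 3, 1, 0)
print(df)
```

If 'V1' could ever hold values above 3, the max trick would no longer be valid and an explicit "any row equals 3" test per group would be the safer choice.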
Posts: 21
Threads: 10
Joined: May 2021
Any ideas?
Thank you in advance for your responses.
Posts: 1,094
Threads: 143
Joined: Jul 2017
Caveat: I practically never use pandas, so my answer may not be too great.
I recently created some online polls to keep the students occupied. The csv file has for example 3 rows of Q1, 3 rows of Q2 and so on, similar to your 3 rows AAA, BBB.
So I adapted the poll csv, made 3 columns, id, Qnr, Result
Result contains nothing to start with, shows up in pandas as NaN.
If you search online, you can very quickly find answers for your particular problem. There is always more than 1 way to skin a cat!
import pandas as pd

csv_file = '/home/pedro/myPython/pandas/test1.csv'
df = pd.read_csv(csv_file)
# Option 1: set 'Result' with two boolean-mask assignments
df.loc[df['Qnr'] == 'Q3', 'Result'] = 1
df.loc[df['Qnr'] != 'Q3', 'Result'] = 0
# Option 2: the same row-by-row result in one pass with apply
df['Result'] = df['Qnr'].apply(lambda x: 1 if x == 'Q3' else 0)
Posts: 21
Threads: 10
Joined: May 2021
I tried two things:
1) df['Result'] = df.groupby(df['ID']).apply(lambda x: np.where(df['Var1']==3,0,1))
It keeps running, but I'm pretty sure it won't work.
2) df['Result'] = df['Var1'].groupby(df['ID']).apply(lambda x: 1 if x == '3' else 0)
I get the error: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I don't understand why.
Any idea how to fix either of these two attempts?
Thank you in advance.
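On the ValueError in attempt 2: inside a groupby apply, the lambda receives the whole Series for one group, not a single value. The comparison therefore yields a Series of booleans, and a plain `if` cannot collapse that into one True/False. Comparing against the string '3' would also never match if the column holds integers. A minimal sketch of one way around both issues, using made-up sample values and the 'ID' / 'Var1' names from the attempts above:

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "ID": ["AAA", "AAA", "BBB", "BBB"],
    "Var1": [1, 3, 2, 2],
})

# Reproduce the cause of the error: a multi-element boolean Series
# has no single truth value.
try:
    bool(df["Var1"] == 3)
except ValueError as err:
    print("ValueError:", err)

# Reduce each group to ONE value with .any(), comparing against the
# integer 3 rather than the string '3'.
per_group = df.groupby("ID")["Var1"].apply(lambda s: int((s == 3).any()))

# per_group is indexed by ID, so map it back onto the rows.
df["Result"] = df["ID"].map(per_group)
print(df)
```

The key change is `.any()`, which turns the per-group boolean Series into a single answer before it is used.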
Posts: 21
Threads: 10
Joined: May 2021
(May-16-2022, 01:01 AM)Pedroski55 Wrote: Caveat: I practically never use pandas, so my answer may not be too great.
It doesn't work because the 'Result' column is not grouped by ID.
Posts: 1,094
Threads: 143
Joined: Jul 2017
Haha, what a wonderful thing the Internet is: instant answers
I could never have got to this on my own!
import numpy as np  # only needed for the np.where variant
# Three alternative one-liners; each marks every row of a 'Type' group that contains a 3
df['Result'] = df['Type'].isin(df.loc[df['Var1'].eq(3), 'Type']).astype(int)
df['Result'] = np.where(df['Type'].isin(df.loc[df['Var1'].eq(3), 'Type']), 1, 0)
df['Result'] = df['Var1'].eq(3).groupby(df['Type']).transform('any').astype(int)
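To see what these one-liners do, here is a small worked sketch on invented data. It keeps the 'Type' and 'Var1' column names from the snippet ('Type' plays the role of the grouping column, which is 'ID' in the original question); the values and the intermediate variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "Type": ["AAA", "AAA", "AAA", "BBB", "BBB", "CCC"],
    "Var1": [1, 2, 3, 1, 2, 3],
})

# The group labels of every row where Var1 equals 3.
groups_with_3 = df.loc[df["Var1"].eq(3), "Type"]

# Variant 1: membership test, then cast the booleans to int.
via_isin = df["Type"].isin(groups_with_3).astype(int)

# Variant 2: the same membership test, mapped to 1 / 0 by np.where.
via_where = np.where(df["Type"].isin(groups_with_3), 1, 0)

# Variant 3: per-group "any row equals 3", broadcast back to all rows.
via_transform = (
    df["Var1"].eq(3).groupby(df["Type"]).transform("any").astype(int)
)

df["Result"] = via_transform
print(df)
```

On this sample all three variants give the same column, so picking one is mostly a matter of taste; the transform version states the per-group intent most directly.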
Posts: 21
Threads: 10
Joined: May 2021
(May-16-2022, 10:24 AM)Pedroski55 Wrote: Haha, what a wonderful thing the Internet is: instant answers
Thank you so much, it works!! :D