Python Forum
Check if a value is present in each group - Printable Version




Check if a value is present in each group - Menthix - May-13-2022

Hi,

I have a dataframe with two columns: an 'ID' column and a 'V1' column whose values range from 1 to 3.
I'd like to mark each ID with 1 if the value 3 appears anywhere in its group, and 0 otherwise.

The attachment makes this clearer: I'd like to obtain the 'Result' column.

Thank you in advance!


RE: Check if a value is present in each group - deanhystad - May-13-2022

What have you tried?


RE: Check if a value is present in each group - Menthix - May-13-2022

I haven't tried much...

I'd like to combine df.groupby('ID') and np.where(df['V1']==3, 1, 0), but I don't think it's possible...


RE: Check if a value is present in each group - Menthix - May-15-2022

Any ideas?
Thank you in advance for your responses.


RE: Check if a value is present in each group - Pedroski55 - May-16-2022

Caveat: I practically never use pandas, so my answer may not be too great.

I recently created some online polls to keep the students occupied. The csv file has, for example, 3 rows of Q1, 3 rows of Q2 and so on, similar to your 3 rows of AAA and BBB.

So I adapted the poll csv and made 3 columns: id, Qnr, Result.

Result is empty to start with, which shows up in pandas as NaN.

If you search online, you can very quickly find answers for your particular problem. There is always more than 1 way to skin a cat!

import pandas as pd
csv_file = '/home/pedro/myPython/pandas/test1.csv'
df = pd.read_csv(csv_file)
# Try these two lines; I can't see how to combine them into one!
df.loc[df['Qnr'] == 'Q3', 'Result'] = 1
df.loc[df['Qnr'] != 'Q3', 'Result'] = 0

# or like this
df['Result'] = df['Qnr'].apply(lambda x: 1 if x == 'Q3' else 0)
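
The two df.loc assignments above can probably be collapsed into one line, since comparing a column to a value already yields a boolean Series. A minimal sketch, assuming a small made-up frame in place of test1.csv (the sample rows are hypothetical, not the real poll data):

```python
import pandas as pd

# Hypothetical stand-in for the poll CSV: three rows per question.
df = pd.DataFrame({
    'id': range(1, 10),
    'Qnr': ['Q1'] * 3 + ['Q2'] * 3 + ['Q3'] * 3,
})

# The comparison gives True/False per row; astype(int) turns that into 1/0.
df['Result'] = (df['Qnr'] == 'Q3').astype(int)

print(df)
```

Note that this still marks each row on its own value; it does not look at the other rows in the same group.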



RE: Check if a value is present in each group - Menthix - May-16-2022

I tried two things:

1) df['Result'] = df.groupby(df['ID']).apply(lambda x: np.where(df['Var1']==3,0,1))
It keeps running, but I'm pretty sure it won't work.

2) df['Result'] = df['Var1'].groupby(df['ID']).apply(lambda x: 1 if x == '3' else 0)
I get the error: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I don't understand why.



Any idea how to fix one of these two attempts?

Thank you in advance.
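
A note on the error in attempt 2: with groupby(...).apply on a Series, the lambda receives the whole group as a Series, so x == '3' is a Series of booleans and the if cannot reduce it to a single True/False. That is what the ValueError is complaining about. Comparing with the string '3' is also unlikely to match numeric values. A minimal sketch of one possible fix, assuming hypothetical sample data with integer values in 'Var1':

```python
import pandas as pd

# Hypothetical sample data; the real frame comes from the attachment.
df = pd.DataFrame({
    'ID': ['AAA', 'AAA', 'AAA', 'BBB', 'BBB', 'BBB'],
    'Var1': [1, 3, 2, 1, 2, 2],
})

# Each group arrives as a Series, so reduce it to one boolean with .any()
# before using it in a condition. Compare with the integer 3, not the string '3'.
per_group = df['Var1'].groupby(df['ID']).apply(lambda s: 1 if (s == 3).any() else 0)
print(per_group)  # one value per ID

# Map the per-group flag back onto every row of the original frame.
df['Result'] = df['ID'].map(per_group)
print(df)
```

The apply call returns one value per ID, which is why a second step (here map) is needed to spread the flag back over the rows.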


RE: Check if a value is present in each group - Menthix - May-16-2022

(May-16-2022, 01:01 AM)Pedroski55 Wrote: Caveat: I practically never use pandas, so my answer may not be too great.

That doesn't work for my case, because the 'Result' column is not computed per ID group.


RE: Check if a value is present in each group - Pedroski55 - May-16-2022

Haha, what a wonderful thing the Internet is: instant answers

I could never have got to this on my own!

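import numpy as np  # needed for the np.where variant
# Three alternative ways to flag every row whose 'Type' group contains a 3 in 'Var1':
df['Result'] = df['Type'].isin(df.loc[df['Var1'].eq(3), 'Type']).astype(int)
df['Result'] = np.where(df['Type'].isin(df.loc[df['Var1'].eq(3), 'Type']), 1, 0)
df['Result'] = df['Var1'].eq(3).groupby(df['Type']).transform('any').astype(int)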
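
The one-liners above use a 'Type' column as the grouping key; with the column names from the first post that would be 'ID' and 'V1'. A minimal runnable sketch on hypothetical sample data (the rows are made up), checking that the three variants agree:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data using the column names from the first post.
df = pd.DataFrame({
    'ID': ['AAA', 'AAA', 'AAA', 'BBB', 'BBB', 'BBB', 'CCC', 'CCC'],
    'V1': [1, 3, 2, 1, 2, 2, 3, 3],
})

# IDs that have at least one row where V1 equals 3.
ids_with_3 = df.loc[df['V1'].eq(3), 'ID']

# Variant 1: membership test, then cast True/False to 1/0.
r1 = df['ID'].isin(ids_with_3).astype(int)

# Variant 2: same membership test, with np.where choosing 1 or 0.
r2 = pd.Series(np.where(df['ID'].isin(ids_with_3), 1, 0), index=df.index)

# Variant 3: per-group "any", broadcast back to every row by transform.
r3 = df['V1'].eq(3).groupby(df['ID']).transform('any').astype(int)

df['Result'] = r3
print(df)
```

The transform('any') variant is the closest match to the original idea of combining groupby with a per-row condition, since transform returns a result aligned with the original rows.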



RE: Check if a value is present in each group - Menthix - May-16-2022

(May-16-2022, 10:24 AM)Pedroski55 Wrote: Haha, what a wonderful thing the Internet is: instant answers


Thank you so much, it works!! :D