Python Forum

Full Version: Converting Filter to If Else Statement and Count
I have a dataset with over 9000 rows and multiple columns. I would like to filter on two columns: one by type (chosen from a dropdown selection), the other by the keywords it contains. Once filtered, I would like to get a count of the remaining rows in the dataset. Currently my filter looks like this:

(df['Type of breach']!='HACK') & (df['Description of incident'].str.contains('bank account', na=False) | df['Description of incident'].str.contains('social security number', na=False))

I am not sure how to get a count of the filtered rows. Do I have to turn the above filter into an if/else statement? If so, how do I write it so that it filters the data frame and keeps all of my and/or conditions?

I have tried this so far:

NewList = 0

for index,
if (df['Type of breach']!='HACK') 
    & (df['Description of incident'].str.contains('bank account', na=False) 
    | df['Description of incident'].str.contains('social security number', na=False))
    NewList +=1
and I receive this error:

  File "<ipython-input-2-5037b8dfd889>", line 3
    for index,
              ^
SyntaxError: invalid syntax
You don't need an explicit for-loop here. The filter is already a boolean Series, and summing it counts the True values:

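# The number of rows where your condition is True.
# The outer parentheses let the expression span several lines and make
# .sum() apply to the whole condition, not only to the last group.
true_cnt = (
    (df['Type of breach'] != 'HACK')
    & (df['Description of incident'].str.contains('bank account', na=False)
       | df['Description of incident'].str.contains('social security number', na=False))
).sum()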

# The number of rows in df (total)
len(df)

# Probably, you need this (the number of rows where the condition is False):
len(df) - true_cnt
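To make the counting concrete, here is a minimal sketch on a tiny made-up data frame. The column names come from the question, but the rows are invented for illustration, and the names mask, matching, filtered_df and not_matching are my own. It shows the count of rows that pass the filter, the same count taken from the filtered frame, and the count of rows the filter drops.

```python
import pandas as pd

# Hypothetical sample data: column names match the thread, rows are invented.
df = pd.DataFrame({
    "Type of breach": ["HACK", "DISC", "PHYS", "HACK", "INSD"],
    "Description of incident": [
        "bank account numbers exposed",
        "social security number mailed to wrong address",
        "laptop stolen",
        None,
        "employee copied bank account details",
    ],
})

desc = df["Description of incident"]

# Boolean Series: True for every row that passes the filter.
mask = (df["Type of breach"] != "HACK") & (
    desc.str.contains("bank account", na=False)
    | desc.str.contains("social security number", na=False)
)

matching = int(mask.sum())          # rows that pass the filter
filtered_df = df[mask]              # the filtered rows themselves
not_matching = int((~mask).sum())   # rows the filter drops

print(matching, len(filtered_df), not_matching)
```

If the filtered rows are needed anyway, len(df[mask]) gives the same number as mask.sum(), so either form works for the count.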