Python Forum
How to Filter a Dataframe
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
How to Filter a Dataframe
#1
I have this data
Quote:Active CreateDate CreateTime UserName S_ID
TRUE 2020-11-03 00:00:00 12:04:29 M001AE8 []
TRUE 2020-11-03 00:00:00 12:04:29 M001AE8 []
TRUE 2020-11-03 00:00:00 12:02:35 M001AF0 []
TRUE 2019-06-11 00:00:00 01:32:12 M017A70 []
TRUE 2020-10-22 00:00:00 09:09:06 S345678 ['S345678']
TRUE 2020-10-22 00:00:00 09:09:06 S567456 ['S567456']
TRUE 2020-03-16 00:00:00 07:41:05 S789077 ['S789077']
TRUE 2020-03-16 00:00:00 07:41:05 S123456 ['S123456']
TRUE 2014-07-05 00:00:00 06:38:59 SvcPcc []
TRUE 2014-07-05 00:00:00 06:38:59 SvcPcc []
TRUE 2020-07-27 00:00:00 12:15:24 SvcPcc []

How do I filter this df to return data where S_ID is NOT []

N.B I pulled this data from excel and loaded it into df
Reply
#2
Quote:df = df[df['S_ID'] != '[]']
Reply
#3
Hi,

I tried your code:
data = data[data['S_ID'] != '[]']
But I still get the whole dataset, not filtered

Output:
Active ... S_ID 6 True ... [] 7 True ... [] 8 True ... [] 9 True ... [] 11 True ... [S345678] 12 True ... [S567456] 13 True ... [S789077] 14 True ... [S123456] 16 True ... [] 17 True ... [] 18 True ... []
Reply
#4
(Nov-16-2020, 05:14 AM)mfkzolo Wrote: Hi,

I tried your code:
data = data[data['S_ID'] != '[]']
But I still get the whole dataset, not filtered

Output:
Active ... S_ID 6 True ... [] 7 True ... [] 8 True ... [] 9 True ... [] 11 True ... [S345678] 12 True ... [S567456] 13 True ... [S789077] 14 True ... [S123456] 16 True ... [] 17 True ... [] 18 True ... []

I even tried this:
data_bool = data['S_ID'] == '[]'
data = data[data_bool]
but still nothing.
Reply
#5
I doubt S_ID column had whitespaces,

import pandas as pd
df=pd.read_excel(open('b111.xlsx', 'rb'),sheet_name='Sheet1') 
print(df)

#adding len coluumn
df['id_char_length']=df['id'].astype(str).map(len)
print(df)
#remove column
df=df.drop(['id_char_length'],axis=1)

#fiter without trim spaces
df = df[df['id'] != '[]']
print(df)

#filter with trim spaces
df = df[df['id'].str.strip() != '[]']
print(df)
Output:
C:\Users\sandeep\Downloads>py b11 active createdate createtime username id 0 True 2020-11-03 12:04:29 M001AE8 [] 1 True 2020-11-03 12:04:29 M001AE8 [] 2 True 2020-11-03 12:02:35 M001AF0 [] 3 True 2019-06-11 01:32:12 M017A70 [] 4 True 2020-10-22 09:09:06 S345678 ['S345678'] 5 True 2020-10-22 09:09:06 S567456 ['S567456'] 6 True 2020-03-16 07:41:05 S789077 ['S789077'] 7 True 2020-03-16 07:41:05 S123456 ['S123456'] 8 True 2014-07-05 06:38:59 SvcPcc [] 9 True 2014-07-05 06:38:59 SvcPcc [] 10 True 2020-07-27 12:15:24 SvcPcc [] active createdate createtime username id id_char_length 0 True 2020-11-03 12:04:29 M001AE8 [] 10 <<<< 1 True 2020-11-03 12:04:29 M001AE8 [] 2 2 True 2020-11-03 12:02:35 M001AF0 [] 2 3 True 2019-06-11 01:32:12 M017A70 [] 2 4 True 2020-10-22 09:09:06 S345678 ['S345678'] 11 5 True 2020-10-22 09:09:06 S567456 ['S567456'] 11 6 True 2020-03-16 07:41:05 S789077 ['S789077'] 11 7 True 2020-03-16 07:41:05 S123456 ['S123456'] 11 8 True 2014-07-05 06:38:59 SvcPcc [] 21 <<<< 9 True 2014-07-05 06:38:59 SvcPcc [] 2 10 True 2020-07-27 12:15:24 SvcPcc [] 2 active createdate createtime username id 0 True 2020-11-03 12:04:29 M001AE8 [] 4 True 2020-10-22 09:09:06 S345678 ['S345678'] 5 True 2020-10-22 09:09:06 S567456 ['S567456'] 6 True 2020-03-16 07:41:05 S789077 ['S789077'] 7 True 2020-03-16 07:41:05 S123456 ['S123456'] 8 True 2014-07-05 06:38:59 SvcPcc [] active createdate createtime username id 4 True 2020-10-22 09:09:06 S345678 ['S345678'] 5 True 2020-10-22 09:09:06 S567456 ['S567456'] 6 True 2020-03-16 07:41:05 S789077 ['S789077'] 7 True 2020-03-16 07:41:05 S123456 ['S123456']
Best Regards,
Sandeep.

GANGA SANDEEP KUMAR
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Filter data based on a value from another dataframe column and create a file using lo pawanmtm 1 417 Jul-15-2020, 06:20 PM
Last Post: pawanmtm
  Filter value from DataFrame apply a function and save to xlsx zinho 1 630 Dec-22-2019, 03:54 PM
Last Post: zinho

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020