Python Forum

Full Version: Filtering a data frame according to number of occurrences of an observation
Hi,
I have a data frame that looks like
COL1
A
A
A
A
A
B
B
B
B
B
C
D
D
D
D
E
E

I'd like to keep only the observations that appear more than 3 times. In this case, that means deleting the rows that have C or E in COL1.

I tried something like
df_test = df_have[df_have.groupby('COL1').count() >= 3]
but it doesn't work.

Any ideas? Thank you in advance! :)
Is the list actually sorted as you have shown? It is much easier if it is.
What you do is count the number of elements, and when the element changes, store (or just print) the tuple <element, count>, but only if count > 3. Done.
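
A minimal sketch of that counting idea, assuming the column really is sorted (or at least grouped) as in the example. The helper name frequent_runs and the plain list col1 are made up for illustration; itertools.groupby does the "when the element changes" bookkeeping.

```python
from itertools import groupby


def frequent_runs(values, threshold=3):
    """Return (element, count) for every run of equal values longer than threshold.

    Assumes equal values are adjacent, i.e. the input is sorted or grouped.
    """
    result = []
    for element, run in groupby(values):
        count = sum(1 for _ in run)  # length of the current run
        if count > threshold:
            result.append((element, count))
    return result


# The COL1 values from the question, as a plain list (illustrative stand-in
# for the data frame column).
col1 = list("AAAAABBBBBCDDDDEE")

kept = frequent_runs(col1)
print(kept)  # [('A', 5), ('B', 5), ('D', 4)]

# Keep only the rows whose value passed the count check.
keep = {element for element, _ in kept}
filtered = [value for value in col1 if value in keep]
print(filtered)
```

If the data stays in pandas and is not necessarily sorted, a commonly used alternative (not the approach described above) is to broadcast the group size back to every row and filter on it: df_have[df_have.groupby('COL1')['COL1'].transform('size') > 3].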