Python Forum

Hi,
I have a data frame that looks like
COL1
A
A
A
A
A
B
B
B
B
B
C
D
D
D
D
E
E

I'd like to keep only the values that appear more than 3 times. So in this case, that means deleting the rows that have C or E in COL1.

I tried something like
df_test = df_have[df_have.groupby('COL1').count() >= 3]
but it doesn't work.

Any ideas? Thank you in advance! :)
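One likely reason the attempt above fails: groupby('COL1').count() returns one row per group rather than one row per original row, so the result does not line up with df_have as a row mask. A minimal sketch of one possible fix, assuming pandas is available and that "more than 3 times" means a strict > 3 (use >= 3 if values appearing exactly three times should stay). The sample frame is rebuilt from the data in the question; group_sizes is a name made up for this example.

```python
import pandas as pd

# Sample data rebuilt from the question: A x5, B x5, C x1, D x4, E x2.
df_have = pd.DataFrame({"COL1": list("AAAAABBBBBCDDDDEE")})

# transform("size") broadcasts each group's size back onto every row,
# so the result has the same index as df_have and works as a row mask.
group_sizes = df_have.groupby("COL1")["COL1"].transform("size")

# Keep only rows whose COL1 value occurs more than 3 times.
df_test = df_have[group_sizes > 3]

print(df_test["COL1"].tolist())
```

This does not depend on the rows being sorted, since the grouping is done by value and not by position.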

Is the list actually sorted as you have shown? It is much easier if it is.
What you do is count consecutive equal elements, and each time the element changes, store (or just print) the tuple <element, count>, but only if count > 3. Do the same for the last run when you reach the end of the list. Done.
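The counting approach described above could be sketched in plain Python roughly as follows. This assumes the values really are sorted (or at least that equal values are adjacent); frequent_runs and min_count are names invented for this example, and itertools.groupby is used here to detect where the element changes.

```python
from itertools import groupby


def frequent_runs(sorted_values, min_count=3):
    """Return (element, count) for every run longer than min_count.

    Assumes equal values are adjacent, e.g. because the input is sorted.
    """
    result = []
    # groupby starts a new group each time the element changes.
    for element, run in groupby(sorted_values):
        count = sum(1 for _ in run)
        if count > min_count:
            result.append((element, count))
    return result


values = list("AAAAABBBBBCDDDDEE")
print(frequent_runs(values))  # [('A', 5), ('B', 5), ('D', 4)]
```

If the input is not sorted, the same value can show up in several separate runs, so this sketch would undercount; sorting first or counting by value avoids that.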

Menthix

supuflounder