pandas data frame - Python Forum (https://python-forum.io), Forum: Data Science, Thread: pandas data frame (/thread-20735.html)
pandas data frame - dervast - Aug-28-2019

Hi all,
I would like to drop all rows whose value in a specific column occurs only once. An example is below:

    import pandas as pd

    data = [[10105, 1], [10105, 1], [10105, 0], [20205, 0], [20205, 0], [20205, 1], [20205, 1], [80215, 1]]
    test = pd.DataFrame(data, columns=["ID", "label"])
    test
    Out[65]:
          ID  label
    0  10105      1
    1  10105      1
    2  10105      0
    3  20205      0
    4  20205      0
    5  20205      1
    6  20205      1
    7  80215      1

I would like to keep all rows except the last one, since its ID value occurs only once. All the other rows are good.
Any ideas?
Thanks
Alex

RE: pandas data frame - ThomasL - Aug-28-2019

You already know groupby and .count:

    test.groupby('ID').count().index
    # Int64Index([10105, 20205, 80215], dtype='int64', name='ID')
    test.groupby('ID').count().values.flatten()
    # array([3, 4, 1], dtype=int64)
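For completeness, here is a minimal sketch of one way the per-ID counts could be turned into a row filter. It is not taken from the thread: the names `group_sizes`, `kept`, and `kept_alt` are made up for illustration, and it uses `transform("size")` rather than `.count()` so that the counts line up row by row with the original frame.

```python
import pandas as pd

# Same example data as in the question.
data = [[10105, 1], [10105, 1], [10105, 0], [20205, 0],
        [20205, 0], [20205, 1], [20205, 1], [80215, 1]]
test = pd.DataFrame(data, columns=["ID", "label"])

# For every row, the number of rows that share its ID
# (aligned with test's index, unlike groupby(...).count()).
group_sizes = test.groupby("ID")["ID"].transform("size")

# Keep only the rows whose ID occurs more than once.
kept = test[group_sizes > 1]

# Alternative: duplicated(keep=False) flags every row of a repeated ID,
# so the same filter can be written without groupby.
kept_alt = test[test.duplicated("ID", keep=False)]

print(kept)
```

Both variants should leave rows 0 to 6 and drop the single 80215 row. `test.groupby("ID").filter(lambda g: len(g) > 1)` is a third option, though it tends to be slower on frames with many groups.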