Python Forum
How to show up the duplicate?
#1
Hi there,
this is my data set:
   user_id                   timestamp      group  landing_page  converted
0   851104  2017-01-21 22:11:48.556739    control      old_page          0
1   804228  2017-01-12 08:01:45.159739    control      old_page          0
2   661590  2017-01-11 16:55:06.154213  treatment      new_page          0
3   853541  2017-01-08 18:28:03.143765  treatment      new_page          0
4   864975  2017-01-21 01:52:26.210827    control      old_page          1

I know there is one duplicate in user_id. I checked this with:
sum(df2.user_id.duplicated())
Now I want to find out which user id is duplicated. How can I do this?

Thanks in advance!
#2
Hello, you can filter the DataFrame with a boolean mask:
ids = df2['user_id']
ids_duplicates = df2[ids.isin(ids[ids.duplicated()])]
This returns every row whose user_id occurs more than once, and you can then get the unique user_id values from it:
list(set(ids_duplicates.user_id))
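For anyone who wants to try this without the original file, here is a minimal sketch of the same idea on a tiny made-up frame. The sample values (including the repeated user_id 773192) and the names n_extra, dup_rows and dup_ids are assumptions for illustration, not taken from the real data set. It uses duplicated(keep=False), which flags every occurrence of a repeated value, as a shorter alternative to the isin filter:

```python
import pandas as pd

# Hypothetical sample data: user_id 773192 is invented and appears twice.
df2 = pd.DataFrame({
    "user_id": [851104, 804228, 773192, 661590, 773192],
    "landing_page": ["old_page", "old_page", "new_page", "new_page", "new_page"],
})

ids = df2["user_id"]

# duplicated() flags only the second and later occurrences,
# so the sum counts the "extra" rows.
n_extra = int(ids.duplicated().sum())

# keep=False flags every occurrence, so all affected rows are returned.
dup_rows = df2[ids.duplicated(keep=False)]

# The distinct user ids that occur more than once.
dup_ids = ids[ids.duplicated()].unique().tolist()

print(n_extra)
print(dup_rows)
print(dup_ids)
```

If the duplicate rows should be dropped afterwards, df2.drop_duplicates(subset="user_id") keeps the first occurrence of each id by default.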