Python Forum
Add group number for duplicates
#1
Hi, I'm looking for a way to add a sequence number to duplicate rows and group them by number.

index  raw  count  group number  filename         Number of Pages  CRITERIA
0      1    1      1             file_753951.pdf  2                Starwars
3      4    2      1             file_654321.pdf  2                Starwars
4      5    3      1             file_456123.pdf  2                Starwars
5      6    4      1             file_548564.pdf  2                Starwars
11     12   5      2             file_351643.pdf  2                Trekky
13     14   6      2             file_789654.pdf  2                Trekky
2      3    7      3             file_321564.pdf  2                Guardians
15     16   8      3             file_963852.pdf  2                Guardians
12     13   9      3             file_741852.pdf  3                Guardians

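# Column used to detect duplicates (df is the DataFrame loaded earlier)
cfc = df["Criteria"]
# Keep every row whose Criteria value occurs more than once, then sort for review
df_getdupes = df[cfc.isin(cfc[cfc.duplicated()])].sort_values(['Criteria', 'Number of Pages'])
display(df_getdupes)
df_getdupes.to_csv('output_dupes1.csv')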
Update (Dec 8):
1. The screenshot shows my required output.

Problem:
1. Create a group number field.
2. Write a sequential number into the group number field for each criteria value. All rows with the same criteria should share the same group number.

Thank you.
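
The two steps in the problem list could be sketched roughly as follows. This is only a minimal illustration, not a confirmed solution from the thread: the sample frame is rebuilt by hand from the table in the post, the extra `file_000000.pdf` / `Unique` row is hypothetical and only there to show that non-duplicates are dropped, and the names `dupes`, `group number`, and `count` are assumptions. It uses `pd.factorize`, which numbers values in order of first appearance, so the group numbers depend on how the rows are ordered at that point.

```python
import pandas as pd

# Sample data rebuilt from the table in the post. The last row is a
# hypothetical non-duplicate added only to show that it gets filtered out.
df = pd.DataFrame(
    {
        "filename": [
            "file_753951.pdf", "file_654321.pdf", "file_456123.pdf",
            "file_548564.pdf", "file_351643.pdf", "file_789654.pdf",
            "file_321564.pdf", "file_963852.pdf", "file_741852.pdf",
            "file_000000.pdf",
        ],
        "Number of Pages": [2, 2, 2, 2, 2, 2, 2, 2, 3, 1],
        "Criteria": [
            "Starwars", "Starwars", "Starwars", "Starwars",
            "Trekky", "Trekky",
            "Guardians", "Guardians", "Guardians",
            "Unique",
        ],
    }
)

# Keep every row whose Criteria value occurs more than once.
# keep=False marks all members of a duplicated set, not just the repeats.
dupes = df[df["Criteria"].duplicated(keep=False)].copy()

# Step 1 and 2: one group number per distinct Criteria value.
# factorize returns 0-based codes in order of first appearance, so add 1.
codes, _ = pd.factorize(dupes["Criteria"])
dupes["group number"] = codes + 1

# Running sequence number over the filtered rows.
dupes["count"] = range(1, len(dupes) + 1)

print(dupes)
```

If the frame is sorted before the `factorize` step (for example with `sort_values(['Criteria', 'Number of Pages'])` as in the snippet above), the group numbers follow the sorted order instead, so the sort should be applied first if that numbering is what is wanted.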

Messages In This Thread
Add group number for duplicates - by atomxkai - Dec-04-2022, 11:14 PM
RE: Add group number for duplicates - by deanhystad - Dec-06-2022, 12:32 AM
RE: Add group number for duplicates - by atomxkai - Dec-08-2022, 06:08 AM
