Dec-04-2022, 11:14 PM
Hi, I'm looking for a way to add a sequence number to duplicate rows and assign each set of duplicates a group number.
index  raw  count  group number  filename         Number of Pages  CRITERIA
0      1    1      1             file_753951.pdf  2                Starwars
3      4    2      1             file_654321.pdf  2                Starwars
4      5    3      1             file_456123.pdf  2                Starwars
5      6    4      1             file_548564.pdf  2                Starwars
11     12   5      2             file_351643.pdf  2                Trekky
13     14   6      2             file_789654.pdf  2                Trekky
2      3    7      3             file_321564.pdf  2                Guardians
15     16   8      3             file_963852.pdf  2                Guardians
12     13   9      3             file_741852.pdf  3                Guardians
cfc = df["Criteria"]
df_getdupes = df[cfc.isin(cfc[cfc.duplicated()])].sort_values(['Criteria', 'Number of Pages'])
display(df_getdupes)
df_getdupes.to_csv('output_dupes1.csv')
Updates:
Dec 8
1. The screenshot is my required output.
Problem:
1. Create a group number field.
2. Write a sequential number in the group number field per criteria. All rows with the same criteria should get the same group number.
Thank you.
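One possible way to get the two fields, sketched under some assumptions: the column is named "Criteria" as in the code above, group numbers are handed out in the order each criteria first appears (the required output does not say how the groups should be ordered), and the sample rows below are only illustrative (the "Solo" row is made up to show a non-duplicate being dropped). pd.factorize gives every distinct value an integer code, so adding 1 yields a group number shared by all rows with the same criteria.

```python
import pandas as pd

# Illustrative sample only; the "Solo" row is a hypothetical non-duplicate.
df = pd.DataFrame({
    "filename": ["file_753951.pdf", "file_000001.pdf", "file_321564.pdf",
                 "file_654321.pdf", "file_351643.pdf", "file_789654.pdf",
                 "file_741852.pdf"],
    "Number of Pages": [2, 1, 2, 2, 2, 2, 3],
    "Criteria": ["Starwars", "Solo", "Guardians", "Starwars",
                 "Trekky", "Trekky", "Guardians"],
})

# Keep only rows whose Criteria occurs more than once (same filter as the post).
cfc = df["Criteria"]
dupes = df[cfc.isin(cfc[cfc.duplicated()])].copy()

# Same group number for every row sharing a Criteria,
# numbered from 1 in order of first appearance (an assumption).
dupes["group number"] = pd.factorize(dupes["Criteria"])[0] + 1

# Put each group together, then number the rows 1..n as the running count.
dupes = dupes.sort_values(["group number", "Number of Pages"])
dupes["count"] = range(1, len(dupes) + 1)

print(dupes[["count", "group number", "filename", "Number of Pages", "Criteria"]])
```

If the groups need a different order (alphabetical, for instance), sorting by Criteria before the factorize step should be enough; the rest stays the same.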