Python Forum

complex sort in dataframe
Hello guys!

I have a dataframe.

import pandas as pd 
data = {'num':[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
       'name':['281.3891.3891.281', 
       '3891.281.281.3891', 
       '1162.5645.5645.500835.500835.1162', 
       '5645.500835.500835.1162.1162.5645',
       '500835.1162.1162.5645.5645.500835',
       '1349.1162.1162.5645.5645.500835.500835.1349',
       '1162.5645.5645.500835.500835.1349.1349.1162',
       '5645.500835.500835.1349.1349.1162.1162.5645',
       '500835.1349.1349.1162.1162.5645.5645.500835',
       '5645.1162.1162.500835.500835.5645',
       '1162.500835.500835.5645.5645.1162',
       '500835.5645.5645.1162.1162.500835'
       ]} 
df = pd.DataFrame(data) 
print(df)
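Each row in the dataframe is a closed chain (start point = end point).
But the chains are not unique: cyclic shifts of the same chain are equivalent (A-B-C = B-C-A = C-A-B), while a different order is a different chain (A-B-C <> B-A-C).
I can't figure out which Python methods to use to normalize every chain (by cyclically shifting its units) so that the duplicates can be dropped.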

Could anybody please help?
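
One possible approach, as a rough sketch: turn every chain into a canonical rotation, so that all cyclic shifts of the same chain produce the same string, and then drop duplicates on that string. The helper name canonical_chain is made up for this example, and it assumes that the values in 'name' come in pairs, each pair being one link "from.to" (which is how the sample data looks).

```python
def canonical_chain(name: str) -> str:
    """Return one fixed rotation of a closed chain, so all rotations compare equal."""
    tokens = name.split('.')
    # Assumption: tokens come in pairs, each pair being one link "from.to".
    links = [tuple(tokens[i:i + 2]) for i in range(0, len(tokens), 2)]
    # Build every cyclic shift of the link list and keep the smallest one.
    # The comparison is on strings, not numbers; any consistent order works here.
    rotations = [links[i:] + links[:i] for i in range(len(links))]
    best = min(rotations)
    return '.'.join(node for link in best for node in link)


names = [
    '281.3891.3891.281',
    '3891.281.281.3891',                  # rotation of the first chain
    '1162.5645.5645.500835.500835.1162',
    '5645.500835.500835.1162.1162.5645',  # rotation of the third chain
    '5645.1162.1162.500835.500835.5645',  # reversed direction, a different chain
]
keys = [canonical_chain(n) for n in names]
print(keys)
```

With the dataframe from the post, something like df['key'] = df['name'].map(canonical_chain) followed by df.drop_duplicates(subset='key') should then keep one row per chain (num 1, 3, 6 and 10 for the sample data). Building all rotations is quadratic in the chain length, which is probably fine for chains this short.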