Feb-07-2020, 02:10 PM
Hello guys!
I have a dataframe.
import pandas as pd

data = {'num': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
        'name': ['281.3891.3891.281',
                 '3891.281.281.3891',
                 '1162.5645.5645.500835.500835.1162',
                 '5645.500835.500835.1162.1162.5645',
                 '500835.1162.1162.5645.5645.500835',
                 '1349.1162.1162.5645.5645.500835.500835.1349',
                 '1162.5645.5645.500835.500835.1349.1349.1162',
                 '5645.500835.500835.1349.1349.1162.1162.5645',
                 '500835.1349.1349.1162.1162.5645.5645.500835',
                 '5645.1162.1162.500835.500835.5645',
                 '1162.500835.500835.5645.5645.1162',
                 '500835.5645.5645.1162.1162.500835']}
df = pd.DataFrame(data)
print(df)

Each line in the dataframe is a chain (start point = end point).
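But the chains are not unique: cyclic shifts of the same chain count as equal, while a different order of units does not (A-B-C = B-C-A = C-A-B <> B-A-C).
I can't figure out which Python methods to use to bring every chain to a common form (by cyclically shifting its units) so that I can drop the duplicates.
Could anybody please help?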
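To make the rule above concrete, here is a minimal sketch of one possible normalization, not a definitive answer. It assumes that every chain is a dot-separated list of integer ids, that consecutive pairs of ids form the units (so '281.3891.3891.281' is the two units 281-3891 and 3891-281), and that two chains are equal exactly when one is a cyclic shift of the other by whole units. The names canonical_chain and chains are hypothetical, introduced only for this illustration.

```python
def canonical_chain(chain: str) -> str:
    """Return the cyclic shift of `chain` that sorts first (hypothetical helper).

    Assumption: tokens are integer ids and every two consecutive tokens
    form one unit (an edge), e.g. 'A.B.B.C.C.A' -> (A,B), (B,C), (C,A).
    """
    tokens = chain.split(".")
    if len(tokens) % 2 != 0:
        raise ValueError(f"expected an even number of ids, got {len(tokens)}")

    # Group the flat token list into units of two ids each.
    units = [(int(tokens[i]), int(tokens[i + 1])) for i in range(0, len(tokens), 2)]

    # Build every cyclic shift of the unit list and keep the smallest one.
    # Comparing lists of integer tuples gives a stable, numeric ordering.
    shifts = [units[i:] + units[:i] for i in range(len(units))]
    best = min(shifts)

    # Flatten back to the original dot-separated text form.
    return ".".join(str(node) for unit in best for node in unit)


# A few chains taken from the dataframe in the question.
chains = [
    "281.3891.3891.281",
    "3891.281.281.3891",
    "1162.5645.5645.500835.500835.1162",
    "5645.500835.500835.1162.1162.5645",
    "5645.1162.1162.500835.500835.5645",
]

# Cyclic shifts collapse to one form; the reversed order stays distinct.
unique_chains = sorted({canonical_chain(c) for c in chains})
print(unique_chains)
```

With the dataframe from the question, the same helper could be applied as df['canonical'] = df['name'].map(canonical_chain), followed by df.drop_duplicates(subset='canonical') to keep one row per distinct chain. Trying all shifts is quadratic in the chain length, which should be fine for short chains like these.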