Feb-11-2021, 10:21 PM
I'm not sure how this could be done with pandas alone. I'm also not sure what match_id is for, other than acting as an auto-increment counter, but I included it anyway for the fun of it.
Here's how it could be done with a simple loop:
>>> import pandas as pd
>>> data = {'id1': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3],
...         'id2': [5, 6, 7, 8, 9, 5, 6, 7, 8, 9, 5, 6, 7, 8, 9],
...         'score': [30, 19, 29, 25, 14, 27, 26, 24, 23, 12, 20, 21, 16, 15, 17]}
>>> df = pd.DataFrame(data)
>>> df = df.sort_values(by="score", ascending=False)
>>> df
    id1  id2  score
0     1    5     30
2     1    7     29
5     2    5     27
6     2    6     26
3     1    8     25
7     2    7     24
8     2    8     23
11    3    6     21
10    3    5     20
1     1    6     19
14    3    9     17
12    3    7     16
13    3    8     15
4     1    9     14
9     2    9     12
>>> matches = {"id1": [], "id2": [], "score": [], "match_id": []}
>>> for row in df.values:
...     id1, id2, score = row
...     if id1 not in matches["id1"] and id2 not in matches["id2"]:
...         matches["id1"].append(id1)
...         matches["id2"].append(id2)
...         matches["score"].append(score)
...         matches["match_id"].append(len(matches["match_id"])+1)
...
>>> matches
{'id1': [1, 2, 3], 'id2': [5, 6, 9], 'score': [30, 26, 17], 'match_id': [1, 2, 3]}
>>> new_df = pd.DataFrame(matches)
>>> new_df
   id1  id2  score  match_id
0    1    5     30         1
1    2    6     26         2
2    3    9     17         3
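If you need this more than once, the same greedy loop could be wrapped in a function. The following is only a sketch under a few assumptions: the name greedy_match is made up for illustration, the frame is assumed to have exactly the columns id1, id2 and score, and match_id is assumed to be nothing more than a 1-based counter. It tracks used ids in sets rather than scanning the result lists on every row.

```python
import pandas as pd


def greedy_match(df):
    """Walk rows from highest to lowest score and keep a row only if
    neither its id1 nor its id2 has been used by an earlier match.
    (Hypothetical helper; the column names are assumptions.)"""
    used_id1, used_id2 = set(), set()
    rows = []
    # A stable sort keeps the original row order among equal scores.
    ordered = df.sort_values(by="score", ascending=False, kind="stable")
    triples = ordered[["id1", "id2", "score"]].itertuples(index=False, name=None)
    for id1, id2, score in triples:
        if id1 in used_id1 or id2 in used_id2:
            continue  # one side of this pair is already matched
        used_id1.add(id1)
        used_id2.add(id2)
        # match_id is just a running counter starting at 1
        rows.append({"id1": id1, "id2": id2, "score": score,
                     "match_id": len(rows) + 1})
    return pd.DataFrame(rows, columns=["id1", "id2", "score", "match_id"])


data = {'id1': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3],
        'id2': [5, 6, 7, 8, 9, 5, 6, 7, 8, 9, 5, 6, 7, 8, 9],
        'score': [30, 19, 29, 25, 14, 27, 26, 24, 23, 12, 20, 21, 16, 15, 17]}
result = greedy_match(pd.DataFrame(data))
print(result)
```

On the sample data this should pick the same three pairs as the loop above. The set lookups only start to matter once the frame gets large; for fifteen rows the list version is just as good.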