Aug-14-2020, 02:13 PM
Hello,
I have a dataframe that holds the names of combinations, the people who make up each combination, and an average score per combination. Going through the dataframe from top to bottom, I would like to extract combinations so that no two selected combinations share a person. In other words, no person should appear in more than one selected combination.
Although it is not very elegant, I managed to write this algorithm. Unfortunately, I get an error that only occurs when I apply a filter to my dataframe. Here is reproducible code that illustrates the problem:
```python
import pandas as pd
import numpy as np

# I initialize my lists
compo_df = []
group_df = []

# I create my dataframe
df = pd.DataFrame({'nom_combinaison': ['Compo1', 'Compo2', 'Compo3', 'Compo4',
                                       'Compo5', 'Compo6', 'Compo7', 'Compo8'],
                   'prenom': ["[Personne3, Personne5, Personne6]",
                              "[Personne3, Personne4, Personne11]",
                              "[Personne9, Personne11, Personne12]",
                              "[Personne8, Personne9, Personne12]",
                              "[Personne1, Personne9, Personne12]",
                              "[Personne3, Personne4, Personne8]",
                              "[Personne1, Personne3, Personne4]",
                              "[Personne3, Personne6, Personne10]"],
                   'moyenne': np.random.randn(8)})

# HERE IS THE LINE THAT MAKES MY PROGRAM CRASH. IT WORKS FINE IF THIS LINE IS DELETED
df = df.loc[(df.moyenne > 0.2)]

# By default, I add the first combination and its people to my lists
compo_df.append(df["nom_combinaison"][0])
group_df.append(df["prenom"][0])

a_remplir = []
for i in range(0, len(df)):  # I go through my dataframe
    # If "a_remplir" has no element in common with df["prenom"][i]:
    if len(set(df["prenom"][i]) & set(a_remplir)) == 0:
        # To help me, I display the elements
        print("combi %s : valeur %s" % (df["nom_combinaison"][i], df["prenom"][i]))
        compo_df.append(df["nom_combinaison"][i])  # I add the name of the combination to my list ...
        group_df.append(df["prenom"][i])  # ... as well as the people
        print("Je met")
        # I add the values of df["prenom"][i] to "a_remplir" so that no two groups share a person
        for value in df["prenom"][i]:
            a_remplir.append(value)
            print(value)
    else:
        print("combi %s : valeur %s" % (df["nom_combinaison"][i], df["prenom"][i]))
        print("similitude : je rejette")
    print("\n")

print(list(set(compo_df)))
print((group_df))
```
The error returned is:
```
combi Combinaison_1 : valeur ['Personne3', 'Personne4', 'Personne11']
Je met
Personne3
Personne4
Personne11
Traceback (most recent call last):
  File "delete.py", line 20, in <module>
    if len(set(df["prenom"][i]) & set(a_remplir)) == 0: #Si la liste "a_remplir" ne contient pas d'élément semblable à df["prenom"][i] :
  File "C:\ProgramData\Miniconda3\lib\site-packages\pandas\core\series.py", line 871, in __getitem__
    result = self.index.get_value(self, key)
  File "C:\ProgramData\Miniconda3\lib\site-packages\pandas\core\indexes\base.py", line 4405, in get_value
    return self._engine.get_value(s, k, tz=getattr(series.dtype, "tz", None))
  File "pandas\_libs\index.pyx", line 80, in pandas._libs.index.IndexEngine.get_value
  File "pandas\_libs\index.pyx", line 90, in pandas._libs.index.IndexEngine.get_value
  File "pandas\_libs\index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 998, in pandas._libs.hashtable.Int64HashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 1005, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 1
```
What happened?
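Here is a smaller sketch of what may be going on. It is a guess rather than a confirmed diagnosis: it assumes that the filter keeps the original row labels, so that `df["prenom"][i]` looks up the label `i` instead of the i-th remaining row. The toy data, the fixed scores (used in place of `np.random.randn` so the run is repeatable), and the names `filtered`, `select_disjoint`, and `taken` are all made up for illustration. The `prenom` column holds real lists here, as the printed output above suggests.

```python
import pandas as pd

# Hypothetical toy data: fixed scores so that row 1 is dropped by the filter.
df = pd.DataFrame({
    "nom_combinaison": ["Compo1", "Compo2", "Compo3", "Compo4"],
    "prenom": [
        ["Personne3", "Personne5"],
        ["Personne3", "Personne4"],
        ["Personne9", "Personne11"],
        ["Personne8", "Personne9"],
    ],
    "moyenne": [0.5, 0.1, 0.9, 0.3],
})

filtered = df.loc[df.moyenne > 0.2]

# The surviving rows keep their original labels; label 1 is gone.
print(list(filtered.index))

# On an integer index, series[i] is a label lookup, not a positional one.
try:
    filtered["prenom"][1]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True
print(label_lookup_failed)


def select_disjoint(frame):
    """Greedy top-to-bottom selection of combinations sharing no person.

    Iterates over the column values directly, so index labels never matter.
    """
    taken = set()          # people already used by a selected combination
    names, groups = [], []
    for name, people in zip(frame["nom_combinaison"], frame["prenom"]):
        if taken.isdisjoint(people):
            names.append(name)
            groups.append(people)
            taken.update(people)
    return names, groups


names, groups = select_disjoint(filtered)
print(names)
print(groups)
```

If this guess is right, iterating over the column values as above, or calling `reset_index(drop=True)` right after the filter, would sidestep the label lookup.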
Thank you.