Error Message: TypeError: unhashable type: 'set'
Python Forum (https://python-forum.io), Forum: Data Science (https://python-forum.io/forum-44.html), Thread: /thread-18102.html

Error Message: TypeError: unhashable type: 'set' - twinpiques - May-06-2019

Hello. I am attempting to compare two columns, sar_details_sent_norm_trigrams_ and caap_details_sent_norm_trigrams_, in a Pandas data frame. There are other columns as well, but these are the two I am comparing. Essentially, I want to keep the records where the text values of the two columns are the same. I've tried a couple of approaches, but I keep getting the following error message:

    TypeError: unhashable type: 'set'

So I either need to work out why I am receiving this and fix it, or try another approach. Any advice would be greatly appreciated. Thanks.

Code snippet:

    # Set with unique terms
    df_sar['sar_details_sent_norm_trigrams_unique'] = df_sar['sar_details_sent_norm_trigrams_'].apply(lambda x: set([trigram for sent in x for trigram in sent]))

    # Set with unique terms
    df_caap['caap_details_sent_norm_trigrams_unique'] = df_caap['caap_details_sent_norm_trigrams_'].apply(lambda x: set([trigram for sent in x for trigram in sent]))

    # Attempt 1:
    df_caap[df_caap.caap_details_sent_norm_trigrams_unique.isin(df_sar.sar_details_sent_norm_trigrams_unique)]

    # Attempt 2:
    set(df_caap.caap_details_sent_norm_trigrams_unique).intersection(set(df_sar.sar_details_sent_norm_trigrams_unique))

Traceback:

    TypeError                                 Traceback (most recent call last)
    <ipython-input-171-2c2bb5551c7e> in <module>()
         21 #set(df1.columns).intersection(set(df2.columns))
         22
    ---> 23 set(df_caap.caap_details_sent_norm_trigrams_unique).intersection(set(df_sar.sar_details_sent_norm_trigrams_unique))

    TypeError: unhashable type: 'set'

RE: Error Message: TypeError: unhashable type: 'set' - micseydel - May-07-2019

You can't put a set in a set, because a set can only contain hashable elements, and a mutable container such as a set is not hashable. You can convert your set to a tuple or a frozenset to make it immutable, which qualifies it for being put into a set.

RE: Error Message: TypeError: unhashable type: 'set' - twinpiques - May-08-2019

Ahhh, thank you. Much appreciated!
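
To illustrate the reply above, here is a minimal sketch of the failure and of the two suggested conversions. The trigram values are made-up sample data, not taken from the original data frame. One caveat worth hedging: a set has no guaranteed iteration order, so if you go the tuple route for comparisons it is safer to sort first, whereas frozenset compares by content directly.

```python
# Two equal sets of trigrams, written in a different order (made-up sample data).
a = {("the", "quick", "fox"), ("quick", "fox", "ran")}
b = {("quick", "fox", "ran"), ("the", "quick", "fox")}

# Putting a plain set inside a set raises TypeError, because set is unhashable.
try:
    {a, b}
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# frozenset is immutable and hashable, and equal contents give equal hashes,
# so the two rows collapse into a single element.
unique = {frozenset(a), frozenset(b)}
print(len(unique))  # 1

# The tuple route: sort first so that equal sets always give equal tuples.
as_tuples = {tuple(sorted(a)), tuple(sorted(b))}
print(len(as_tuples))  # 1
```

The frozenset version is usually the simpler choice here, since the goal in the question is to compare collections of unique trigrams regardless of order.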

RE: Error Message: TypeError: unhashable type: 'set' - DeaD_EyE - May-08-2019

Quick and dirty solution with less Python knowledge:

    hashable_data = tuple(set(ITERABLE))

Mutable objects don't have a hash, because they can mutate. Immutable objects don't change, so they have a hash. There is also a built-in type called frozenset, and yes, it does what it sounds like: it is an immutable set, which has a hash.

You can make the test:

    # will fail
    {set(): 42}

    # is ok
    {frozenset(): 42}

Try this:

    df_sar['sar_details_sent_norm_trigrams_unique'] = df_sar['sar_details_sent_norm_trigrams_'].apply(lambda x: frozenset([trigram for sent in x for trigram in sent]))

You can also remove the square brackets. Then it's a generator expression, which is consumed directly by frozenset (saves memory). Otherwise a complete list is first built in memory and only then passed to frozenset.

    df_sar['sar_details_sent_norm_trigrams_unique'] = df_sar['sar_details_sent_norm_trigrams_'].apply(lambda x: frozenset(trigram for sent in x for trigram in sent))

RE: Error Message: TypeError: unhashable type: 'set' - twinpiques - May-22-2019

(May-08-2019, 05:32 PM) DeaD_EyE Wrote: Quick and dirty solution with less Python knowledge:

This is great - thank you.
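
For anyone landing on this thread later, here is a rough pure-Python sketch of the whole comparison once frozenset is used. It assumes each cell holds a list of sentences and each sentence is a list of hashable trigrams (tuples here); the lists sar_rows and caap_rows and the helper unique_trigrams are hypothetical stand-ins for the two data frame columns and the lambda, not code from the original poster.

```python
def unique_trigrams(sentences):
    """Flatten a list of sentences (each a list of trigrams) into a frozenset."""
    return frozenset(trigram for sent in sentences for trigram in sent)


# Hypothetical stand-ins for the two columns: one entry per row.
sar_rows = [
    [[("a", "b", "c"), ("b", "c", "d")], [("a", "b", "c")]],
    [[("x", "y", "z")]],
]
caap_rows = [
    [[("b", "c", "d")], [("a", "b", "c")]],
    [[("p", "q", "r")]],
]

# One frozenset of unique trigrams per row.
sar_unique = {unique_trigrams(row) for row in sar_rows}
caap_unique = [unique_trigrams(row) for row in caap_rows]

# Counterpart of Attempt 2: trigram sets that occur in both collections.
common = set(caap_unique) & sar_unique

# Counterpart of Attempt 1: indices of caap rows whose trigram set also occurs in sar.
matching_rows = [i for i, s in enumerate(caap_unique) if s in sar_unique]

print(common)
print(matching_rows)  # [0]
```

Because the rows are now frozensets, building a set of them and testing membership no longer raises the TypeError from the original traceback.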