Python Forum
Value_Counts Question - Printable Version


Value_Counts Question - smw10c - Mar-28-2017

I hope you are all having a good day. I have a question regarding the value_counts method. Why does the following code not work?


sal[sal['JobTitle'].value_counts()==1]

The error message is: 


IndexingError: Unalignable boolean Series key provided


RE: Value_Counts Question - nilamo - Mar-28-2017

sal[True] apparently doesn't make sense. If "JobTitle" is a valid key, why would a bool be valid?


RE: Value_Counts Question - smw10c - Mar-28-2017

(Mar-28-2017, 02:54 PM)nilamo Wrote: sal[True] apparently doesn't make sense. If "JobTitle" is a valid key, why would a bool be valid?


What I'm trying to do is subset the dataset to where there is only one observation of a specific Job Title.


RE: Value_Counts Question - zivoni - Mar-28-2017

You can't use your value_counts() Series to index your dataframe. value_counts() returns a Series with one entry per unique item in the column, so its length differs from the length of the original dataframe. It is also indexed by the counted items (the job titles) rather than by the dataframe's row labels, which is why pandas cannot align the boolean mask.

You need something slightly more involved: use value_counts() to find the items that occur only once (the counted items form the index of the value_counts() Series), and then select the matching rows of the dataframe with .isin().

Simple example with fictional data:
In [26]: import pandas as pd

In [27]: df = pd.DataFrame({'title': ['steward', 'chef', 'cook', 'steward', 'cook'], 'value':[13,34,23,30,17]})

In [28]: counts = df.title.value_counts()

In [29]: counts
Out[29]: 
steward    2
cook       2
chef       1
Name: title, dtype: int64

In [30]: counts[counts==1]
Out[30]: 
chef    1
Name: title, dtype: int64

In [31]: df[df.title.isin(counts[counts==1].index)]
Out[31]: 
  title  value
1  chef     34
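
For comparison, here is a minimal sketch of two other ways to build a row-aligned mask for the same fictional data. It is not part of the session above; the variable names (mask_map, mask_dup, singles_map, singles_dup) are only illustrative. The idea in both cases is to produce a boolean Series with one entry per dataframe row, which is what boolean indexing expects.

```python
import pandas as pd

# Same fictional data as in the example above.
df = pd.DataFrame({
    'title': ['steward', 'chef', 'cook', 'steward', 'cook'],
    'value': [13, 34, 23, 30, 17],
})

# Option 1: look up each row's title in the counts, so the resulting
# mask has one entry per row and shares the dataframe's index.
counts = df['title'].value_counts()
mask_map = df['title'].map(counts) == 1
singles_map = df[mask_map]

# Option 2: duplicated(keep=False) marks every row whose title appears
# more than once; negating it keeps the titles that appear exactly once.
mask_dup = ~df['title'].duplicated(keep=False)
singles_dup = df[mask_dup]

print(singles_map)
print(singles_dup)
```

On this data both options select the same single row as the .isin() approach. If the column contains missing values the two may behave differently, since value_counts() drops missing values by default, so it is worth checking on the real dataset.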