Python Forum

Value_Counts Question
I hope you are all having a good day. I have a question regarding the value_counts method. Why does the following code not work?


sal[sal['JobTitle'].value_counts()==1]

The error message is: 


IndexingError: Unalignable boolean Series key provided
sal[True] apparently doesn't make sense. If "JobTitle" is a valid key, why would a bool be valid?
(Mar-28-2017, 02:54 PM)nilamo Wrote: sal[True] apparently doesn't make sense. If "JobTitle" is a valid key, why would a bool be valid?


What I'm trying to do is subset the dataset to the rows whose Job Title occurs only once.
You can't use your value_counts() Series to index your dataframe. value_counts returns a Series whose length equals the number of unique items in the column, which differs from the length of the original dataframe. Its index also holds the counted values (the job titles) rather than the dataframe's row labels, so pandas cannot align the boolean Series with the rows.

You need something slightly more involved: use value_counts to find the items that occur only once (the counted items form the index of the value_counts Series), then select the matching rows of the dataframe with .isin().

Simple example with fictional data:
In [27]: df = pd.DataFrame({'title': ['steward', 'chef', 'cook', 'steward', 'cook'], 'value':[13,34,23,30,17]})

In [28]: counts = df.title.value_counts()

In [29]: counts
Out[29]: 
steward    2
cook       2
chef       1
Name: title, dtype: int64

In [30]: counts[counts==1]
Out[30]: 
chef    1
Name: title, dtype: int64

In [31]: df[df.title.isin(counts[counts==1].index)]
Out[31]: 
  title  value
1  chef     34
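
For comparison, here is a sketch of two other ways to get the same rows, using the same fictional data. Neither comes from the posts above. They assume that a boolean mask sharing the dataframe's index is what is needed: .map() looks up each row's title in the counts, giving one count per row, and .duplicated(keep=False) flags every title that occurs more than once. The names per_row_counts, singles and singles_alt are only illustrative.

```python
import pandas as pd

# Same fictional data as in the example above.
df = pd.DataFrame({'title': ['steward', 'chef', 'cook', 'steward', 'cook'],
                   'value': [13, 34, 23, 30, 17]})

# Look up each row's title in the counts, giving one count per row.
# The result shares df's index, so it works as a boolean mask.
per_row_counts = df['title'].map(df['title'].value_counts())
singles = df[per_row_counts == 1]

# Alternative: keep rows whose title is not duplicated anywhere in the column.
singles_alt = df[~df['title'].duplicated(keep=False)]

print(singles)
```

With the real data this would presumably read sal['JobTitle'] in place of df['title'].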