Python Forum
Can anyone explain to me what's happening here?
#1
I'm not sure whether this belongs in Homework or Data Science, but it's related to pandas, so here goes.

df[df['TotalPayBenefits'].max()]
Why does this result in an error, while

df[df['TotalPayBenefits']==df['TotalPayBenefits'].max()]
works fine?



Error:
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2645             try:
-> 2646                 return self._engine.get_loc(key)
   2647             except KeyError:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 567595.43

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-24-d77051c168c8> in <module>
----> 1 df[df['TotalPayBenefits'].max()]

~\anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   2798             if self.columns.nlevels > 1:
   2799                 return self._getitem_multilevel(key)
-> 2800             indexer = self.columns.get_loc(key)
   2801             if is_integer(indexer):
   2802                 indexer = [indexer]

~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2646                 return self._engine.get_loc(key)
   2647             except KeyError:
-> 2648                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2649         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2650         if indexer.ndim > 1 or indexer.size > 1:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 567595.43
The value in the KeyError is kind of my answer, but it doesn't give me the name of the person whose salary it corresponds to.
#2
In the first snippet you are indexing the dataframe with a single value, the max of that column. Pandas treats a single value inside df[...] as a column label, and there is no column named 567595.43, hence the KeyError. It has no way of knowing whether you want the rows with that value in the same column, a different column, or whatever. In the second snippet the comparison produces a True/False value for every row, and df[...] keeps the rows where it is True. It's kind of like saying "if the value in TotalPayBenefits equals the max of TotalPayBenefits, include that row in the selection".
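
Here is a small sketch of the difference. The dataframe is made up for illustration: only TotalPayBenefits comes from your post, and the EmployeeName column and all the values are my assumptions about what your data looks like.

```python
import pandas as pd

# Hypothetical stand-in for the poster's data; names and numbers are made up.
df = pd.DataFrame({
    "EmployeeName": ["Ann", "Bob", "Cid"],
    "TotalPayBenefits": [100.0, 300.0, 200.0],
})

top = df["TotalPayBenefits"].max()  # a single float, the column maximum

# df[top] is treated as a column lookup, and no column has that label.
try:
    df[top]
    scalar_lookup_failed = False
except KeyError:
    scalar_lookup_failed = True

# The comparison builds a boolean Series with one True/False per row.
mask = df["TotalPayBenefits"] == top
rows_at_max = df[mask]  # keeps only the rows where mask is True

# idxmax() gives the index label of the first row holding the maximum,
# which .loc can then use to pull out a single cell such as the name.
name_at_max = df.loc[df["TotalPayBenefits"].idxmax(), "EmployeeName"]
```

If all you want is the name, the idxmax() line is probably the shortest route, but note it only returns the first row when several rows tie for the maximum, whereas the boolean mask returns all of them.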

And of course, remember that a selection like this is not guaranteed to be an independent dataframe. If you plan to modify the result, call .copy() on it first. That's a common gotcha.
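
A minimal sketch of the copy step, again on made-up data (the EmployeeName column, the values and the 150 threshold are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical data; only the TotalPayBenefits column name is from the thread.
df = pd.DataFrame({
    "EmployeeName": ["Ann", "Bob", "Cid"],
    "TotalPayBenefits": [100.0, 300.0, 200.0],
})

# Take an explicit, independent copy of the selected rows.
high = df[df["TotalPayBenefits"] > 150].copy()

# Changing the copy leaves the original dataframe untouched.
high["TotalPayBenefits"] = 0.0
```

Without the .copy(), some pandas versions emit a SettingWithCopyWarning on the assignment, because it is ambiguous whether you meant to change the original.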