Python Forum
Why any(0) does not work here?
#1
Hello, I know that to select all rows having a value greater than 2 or less than -2, we can use the any method on a boolean DataFrame. Consider the following example:

import numpy as np
import pandas as pd

data = pd.DataFrame(np.random.randn(10, 4))
In [35]: data

Out[35]: 
          0         1         2         3
0 -0.525633 -0.697104 -0.531631 -1.006075
1  1.961877  0.529459 -2.014627  0.310788
2 -0.298181 -0.140505 -0.437984 -1.180706
3 -0.178647  0.254879 -1.384031  0.200253
4  0.287127 -0.111793  2.289889 -1.727586
5 -0.806481  0.069195  0.690085 -2.182153
6  1.201117  2.221714  1.199904  0.528438
7  0.833238  0.900948  0.731922 -1.821565
8  0.291749  1.134619 -1.554773  1.877738
9  0.000611 -0.459774 -0.410249  0.116427


In[36]: data[(np.abs(data) > 2).any(1)]
Out[36]: 
          0         1         2         3
1  1.961877  0.529459 -2.014627  0.310788
4  0.287127 -0.111793  2.289889 -1.727586
5 -0.806481  0.069195  0.690085 -2.182153
6  1.201117  2.221714  1.199904  0.528438
Here any(1) reduces along axis 1, i.e. across the columns of each row, so a row is selected if any of its values is greater than 2 or less than -2.
Column 0 has no value that satisfies the condition; column 1 has one in row 6, column 2 in rows 1 and 4, and column 3 in row 5. As a result, rows 1, 4, 5 and 6 are displayed.
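To make the two directions concrete, here is a minimal sketch on a small hand-written frame (the values and the names df, row_mask and col_mask are made up for illustration, not taken from the session above). It uses the keyword form axis=1 / axis=0, which I believe newer pandas versions require in place of the positional any(1) / any(0).

```python
import numpy as np
import pandas as pd

# Illustrative 4x3 frame with fixed values so the result is reproducible.
df = pd.DataFrame(
    [[0.5, -0.7, 0.1],
     [1.9, 0.5, -2.1],
     [0.3, 0.2, 0.4],
     [-2.5, 0.1, 0.0]]
)

mask = np.abs(df) > 2          # boolean DataFrame, same shape as df

row_mask = mask.any(axis=1)    # one boolean per row, indexed by the row labels
col_mask = mask.any(axis=0)    # one boolean per column, indexed by the column labels

print(row_mask.tolist())       # [False, True, False, True]
print(col_mask.tolist())       # [True, False, True]
print(df[row_mask])            # rows 1 and 3
```

Note that the two masks have different lengths and different indexes: row_mask lines up with the rows of df, col_mask with its columns.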

Now suppose I want to select every column that has a value greater than 2 or less than -2. I use any(0) so that the any method reduces along axis 0, i.e. down the rows of each column. However, doing so gives me an error. Why is that, and is there a way to fix it?

In [37]: data[(np.abs(data) > 2).any(0)]   

---------------------------------------------------------------------------
IndexingError                             Traceback (most recent call last)
<ipython-input-37-bcf27c252f2c> in <module>
----> 1 data[(np.abs(data) > 2).any(0)]

~/opt/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py in __getitem__(self, key)
   2789         # Do we have a (boolean) 1d indexer?
   2790         if com.is_bool_indexer(key):
-> 2791             return self._getitem_bool_array(key)
   2792 
   2793         # We are left with two options: a single key, and a collection of keys,

~/opt/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py in _getitem_bool_array(self, key)
   2841         # check_bool_indexer will throw exception if Series key cannot
   2842         # be reindexed to match DataFrame rows
-> 2843         key = check_bool_indexer(self.index, key)
   2844         indexer = key.nonzero()[0]
   2845         return self._take_with_is_copy(indexer, axis=0)

~/opt/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in check_bool_indexer(index, key)
   2314         if mask.any():
   2315             raise IndexingError(
-> 2316                 "Unalignable boolean Series provided as "
   2317                 "indexer (index of the boolean Series and of "
   2318                 "the indexed object do not match)."

IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).
#2
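By default, the [] (__getitem__) operation selects rows when its argument is a boolean array or Series. Here the mask produced by any(0) is indexed by the column labels (0 to 3), which cannot be aligned with the row index (0 to 9), hence the error. You can use loc to apply the mask to the columns instead, e.g.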

data.loc[:, (np.abs(data) > 2).any(0)]
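As a rough, self-contained sketch of both the failure and the fix, the snippet below uses a small made-up frame (df, col_mask, error_name and selected are illustrative names, not from the original post) and the keyword form axis=0. It catches a generic Exception only to report the error type, since the import location of IndexingError has varied between pandas versions.

```python
import numpy as np
import pandas as pd

# Illustrative 4x3 frame; columns 0 and 2 each contain a value with |x| > 2.
df = pd.DataFrame(
    [[0.5, -0.7, 0.1],
     [1.9, 0.5, -2.1],
     [0.3, 0.2, 0.4],
     [-2.5, 0.1, 0.0]]
)

col_mask = (np.abs(df) > 2).any(axis=0)   # indexed by column labels 0, 1, 2

# Plain [] treats a boolean Series as a row selector and tries to align it
# with the row index (0..3). Row label 3 has no entry in col_mask, so it fails.
error_name = None
try:
    df[col_mask]
except Exception as exc:
    error_name = type(exc).__name__
print(error_name)

# .loc with ":" for the rows applies the mask to the columns instead.
selected = df.loc[:, col_mask]
print(list(selected.columns))             # [0, 2]
```

The design point is simply that the mask has to be handed to the axis whose labels it carries: a mask built with axis=0 carries column labels, so it belongs in the second slot of loc.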
#3
Thanks