It works differently in Pandas and NumPy, as @zivoni mentioned.
Pandas objects such as Series and DataFrame, and NumPy arrays with more than one element, do not have a single boolean value.
Asking for one raises ValueError (they refuse to guess True or False).
So the normal Python operators and, or and not will not work on them.
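For the NumPy side, here is a minimal sketch of that behaviour. The arrays a and b are made up for illustration; the point is only that `and` fails while the elementwise operators work.

```python
import numpy as np

# Two made-up boolean arrays, just for illustration.
a = np.array([True, False, True])
b = np.array([True, True, False])

# Plain `and` asks Python for bool(a), which NumPy refuses
# for an array with more than one element.
try:
    a and b
    raised = False
except ValueError:
    raised = True

# The elementwise versions compare position by position.
both = a & b      # same result as np.logical_and(a, b)
either = a | b    # same result as np.logical_or(a, b)
negated = ~a      # same result as np.logical_not(a)

print(raised)
print(both, either, negated)
```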
Python:
>>> lst_1 = [1, 2, 3]
>>> bool(lst_1)
True
>>> lst_1 = [2, 8]
>>> lst_2 = [2, 8]
>>> lst_1 and lst_2
[2, 8]
Pandas:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.random.randn(3,3))
>>> df
          0         1         2
0  0.518276  0.511278 -1.200522
1  0.301082  0.166139  0.173871
2 -0.968949  0.840400 -0.161232
>>> df[(df > .2) & (df < 1)]
          0         1   2
0  0.518276  0.511278 NaN
1  0.301082       NaN NaN
2       NaN  0.840400 NaN
>>> # Now look at the bool value
>>> bool(df[(df > 1)])
Traceback (most recent call last):
  File "<string>", line 301, in runcode
  File "<interactive input>", line 1, in <module>
  File "C:\Python34\lib\site-packages\pandas\core\generic.py", line 917, in __nonzero__
    .format(self.__class__.__name__))
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> # Same error with and, or, not
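If what you actually want is one True/False answer, the error message already points at the way out: reduce the object explicitly. A small sketch with made-up values (the frame and column names here are only for illustration):

```python
import pandas as pd

# Made-up values, only for illustration.
df = pd.DataFrame({"a": [0.5, 0.3, -0.9], "b": [0.5, 0.1, 0.8]})
mask = df > 1  # DataFrame of booleans, all False for these values

# `and` needs bool(mask) first, so it raises just like bool() does.
try:
    if mask and True:
        pass
    and_raised = False
except ValueError:
    and_raised = True

# Reduce explicitly instead of letting Python guess.
any_above_one = bool(mask.any().any())    # is at least one cell True?
all_above_one = bool(mask.all().all())    # is every cell True?
nothing_selected = df[df["a"] > 1].empty  # did the row filter return no rows?

print(and_raised, any_above_one, all_above_one, nothing_selected)
```

Note that .empty is checked on a row filter here. Indexing with a whole boolean DataFrame, as in df[(df > 1)], keeps the original shape and fills with NaN, so it is not empty.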
Boolean indexing
Quote: Another common operation is the use of boolean vectors to filter the data. The operators are: | for or, & for and, and ~ for not. These must be grouped by using parentheses.
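A short sketch of those three operators on a Series, with made-up values. The parentheses matter because & and | bind tighter than the comparison operators, so s > 0.2 & s < 1 would be parsed as s > (0.2 & s) < 1, which is not the intended test.

```python
import pandas as pd

# Made-up values, only for illustration.
s = pd.Series([-1.2, 0.3, 0.5, 1.7])

in_range = (s > 0.2) & (s < 1)    # and
outside = (s <= 0.2) | (s >= 1)   # or
not_in_range = ~in_range          # not

# A boolean Series can be used directly as a filter.
print(s[in_range].tolist())       # [0.3, 0.5]
print(s[not_in_range].tolist())   # [-1.2, 1.7]
```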