Python Forum

boolean array: looking for all rows where all is True
Hi

I want to get the rows where all columns are True. I assumed it would be simple, but I do not understand why the following code does not work. What am I missing?

Thanks

Paul

import numpy as np

M = np.array([[False, True, False, True],
              [True, True, True, True],     # OK
              [True, True, False, True],
              [True, True, True, False],
              [True, True, True, True]])    # OK

loc = np.where(M[:, 0]) and np.where(M[:, 1]) and np.where(M[:, 2]) and np.where(M[:, 3])
print(f"loc = {loc}")
Output:
loc = (array([0, 1, 2, 4], dtype=int64),)
There is a built-in function for this, numpy.all:

import numpy as np

arr = np.array([[False, True, False, True],
              [True, True, True, True],
              [True, True, False, True],
              [True, True, True, False],
              [True, True, True, True]])

print(np.all(arr, axis=1))

# ->[False  True False False  True]
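As a side note on why the original attempt printed the indices for the last column only: np.where with a single argument returns a tuple, and Python's "and" does not combine arrays element-wise. It returns the last operand when every operand is truthy, and a non-empty tuple always is. A minimal sketch, reusing the array from the question, of how the row indices could be obtained instead (the names chained, last_only, row_mask, rows and mask_and are only illustrative):

```python
import numpy as np

M = np.array([[False, True, False, True],
              [True, True, True, True],
              [True, True, False, True],
              [True, True, True, False],
              [True, True, True, True]])

# `and` between non-empty tuples simply returns the last operand,
# so the chained expression is equivalent to np.where(M[:, 3]).
chained = np.where(M[:, 0]) and np.where(M[:, 1]) and np.where(M[:, 2]) and np.where(M[:, 3])
last_only = np.where(M[:, 3])

# Boolean mask with one entry per row: True where every column is True.
row_mask = np.all(M, axis=1)

# Indices of the rows that are entirely True.
rows = np.where(row_mask)[0]
print(rows)  # [1 4]

# Element-wise alternative to the chained `and`: the & operator.
mask_and = M[:, 0] & M[:, 1] & M[:, 2] & M[:, 3]
```

The & version gives the same mask as np.all(M, axis=1), but np.all does not need the columns to be listed by hand.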
Thanks for the reference to the np.all function. Nonetheless, even after reading the docs, I still do not understand why axis should be 1 instead of 0, which I thought was the usual choice for rows in a 2D array (of course I tried with axis=0).
The axis argument of np.all works the same way as the axis argument of other numpy calls. If you print the array shape, axis 0 corresponds to the first dimension, axis 1 to the second dimension, and so on.
import numpy as np

arr = np.array([
    [0, 0, 0],
    [0, 0, 1],
    [0, 1, 1],
    [1, 1, 1]])

print("Rows", arr[0, :], arr[1, :], arr[2, :], arr[3, :])
print("Columns", arr[:, 0], arr[:, 1], arr[:, 2])

print(f'\nShape = {arr.shape}', 'np.all(arr)', sep='\n')
print('axis 0', np.all(arr, axis=0))
print('axis 1', np.all(arr, axis=1))
print('arr[0, :] + arr[1, :] =', arr[0, :] + arr[1, :])

print('\nnp.sum(arr)', sep='\n')
print('axis 0', np.sum(arr, axis=0))
print('axis 1', np.sum(arr, axis=1))
Output:
Rows [0 0 0] [0 0 1] [0 1 1] [1 1 1]
Columns [0 0 0 1] [0 0 1 1] [0 1 1 1]

Shape = (4, 3)
np.all(arr)
axis 0 [False False False]
axis 1 [False False False  True]
arr[0, :] + arr[1, :] = [0 0 1]

np.sum(arr)
axis 0 [1 2 3]
axis 1 [0 1 2 3]
In this example the array has 4 rows (shape[0] == 4) and 3 columns (shape[1] == 3). Summing along axis 0 adds the rows together, arr[0, :] + arr[1, :] + arr[2, :] + arr[3, :], so the result holds one sum per column. Summing along axis 1 adds the columns together, arr[:, 0] + arr[:, 1] + arr[:, 2], so the result holds one sum per row.

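The same is true for np.any() and np.all(). np.all(arr, axis=0) combines row 0, row 1, row 2 and row 3 with an element-wise logical "and". When you "and" (np.all) along axis 0, you "and" the rows together, and the result is the "and" for each column. To test whether every column of a row is True, you "and" the columns together instead, which is axis=1.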
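One way to check which axis to use is to look at the shape of the result: the axis passed to np.all is the one that disappears. A small sketch with the same array (the names per_column and per_row are illustrative only):

```python
import numpy as np

arr = np.array([
    [0, 0, 0],
    [0, 0, 1],
    [0, 1, 1],
    [1, 1, 1]])

# axis=0 collapses the 4 rows: one result per column.
per_column = np.all(arr, axis=0)
print(per_column.shape)  # (3,)

# axis=1 collapses the 3 columns: one result per row.
per_row = np.all(arr, axis=1)
print(per_row.shape)  # (4,)

# Use the per-row mask to select the rows whose elements are all truthy.
print(arr[per_row])  # [[1 1 1]]
```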

Does that clear things up?
OK, I figured out my mistake; thanks.