numpy, numpy... can't live with 'em, can't live without 'em. There are so many ways to do stuff in numpy that this is just overwhelming:
>>> np.
Display all 600 possibilities? (y or n)

I admit that the problem is still a mystery to me, but based on my assumptions I would define my own little problem and try to solve it:
- I have a numpy array of integers with shape (15, 5)
- I want to find how many rows have two or more integers in common with the row at index 1
So I try to divide it into subproblems and there are plenty:
- how to compare 1d arrays (rows)
- how to find number of common elements
- how to apply comparison to all rows
- how to count rows which have two or more common elements
The following is based on this sample array:

import numpy as np

arr = np.array([
    [3, 8, 21, 31, 37],
    [15, 32, 34, 38, 40],
    [3, 12, 20, 26, 37],
    [6, 7, 15, 21, 30],
    [8, 10, 14, 27, 31],
    [10, 15, 26, 37, 41],
    [26, 29, 32, 36, 39],
    [6, 9, 11, 13, 35],
    [15, 18, 30, 31, 37],
    [15, 17, 24, 26, 34],
    [2, 3, 23, 28, 35],
    [1, 3, 13, 40, 43],
    [3, 4, 23, 29, 30],
    [7, 12, 22, 23, 33],
    [2, 5, 7, 30, 40]
])

How to compare 1d arrays?
As paul18fr already suggested, there is intersect1d. How can one use it?

>>> np.intersect1d(arr[0], arr[1])
array([], dtype=int64)
>>> np.intersect1d(arr[9], arr[1])
array([15, 34])

Problem solved - we know how to find common integers.
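One detail worth knowing before counting: intersect1d returns the sorted, unique common values, so repeated integers are reported only once. A minimal sketch with two made-up rows (a and b are hypothetical, they are not taken from the sample array):

```python
import numpy as np

# Hypothetical rows containing repeated values (not from the sample array).
a = np.array([15, 15, 34, 40, 2])
b = np.array([34, 15, 15, 38, 40])

# intersect1d returns the sorted, de-duplicated values found in both arrays.
common = np.intersect1d(a, b)

print(common)       # the distinct common values
print(common.size)  # how many distinct common values there are
```

The rows of the sample array contain no repeats, so this makes no difference there, but it matters if your real data can contain duplicates within a row.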
How to find number of common elements?
numpy has the handy .size attribute for that. Building on the solution to the previous problem:

>>> np.intersect1d(arr[0], arr[1]).size
0
>>> np.intersect1d(arr[9], arr[1]).size
2
Problem solved - we know how to find the number of common integers. However, if we think about it, we don't actually want the number of common integers. We need to know whether the condition is met or not, i.e. whether there are at least two common integers. So we can adjust this:

>>> 2 <= np.intersect1d(arr[0], arr[1]).size
False
>>> 2 <= np.intersect1d(arr[9], arr[1]).size
True

How to apply it to all rows
This is nice and all, but how do we apply this to every row in the array? There is apply_along_axis, which can be used. However, in order to do so we need a function which will be applied to every row. As we already have a solution for a single row, we can write such a function with no effort at all:
def at_least_two_common(row, test_array):
    return 2 <= np.intersect1d(row, test_array).size

Now we apply this function to all rows:
>>> np.apply_along_axis(at_least_two_common, 1, arr, arr[1])
[False True False False False False False False False True False False False False False]

We observe that there are two Trues - the row itself and the actual match. So in order to get the correct result we should deduct the row itself.
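To see which rows are behind those Trues, one option is to switch off the reference row in the mask and ask for the indices of what is left. This is only a sketch: ref_index, mask and matching_rows are names I made up for illustration, and flatnonzero is just one of several ways to get the indices.

```python
import numpy as np

arr = np.array([
    [3, 8, 21, 31, 37], [15, 32, 34, 38, 40], [3, 12, 20, 26, 37],
    [6, 7, 15, 21, 30], [8, 10, 14, 27, 31], [10, 15, 26, 37, 41],
    [26, 29, 32, 36, 39], [6, 9, 11, 13, 35], [15, 18, 30, 31, 37],
    [15, 17, 24, 26, 34], [2, 3, 23, 28, 35], [1, 3, 13, 40, 43],
    [3, 4, 23, 29, 30], [7, 12, 22, 23, 33], [2, 5, 7, 30, 40],
])

def at_least_two_common(row, test_array):
    # True when the row shares two or more distinct values with test_array.
    return 2 <= np.intersect1d(row, test_array).size

ref_index = 1  # hypothetical name for the index of the row we compare against

# Boolean mask with one entry per row of arr.
mask = np.apply_along_axis(at_least_two_common, 1, arr, arr[ref_index])

# The reference row always matches itself, so exclude it explicitly.
mask[ref_index] = False

# Indices of the remaining rows that satisfy the condition.
matching_rows = np.flatnonzero(mask)
print(matching_rows)
```

Excluding the reference row by index does the same job as subtracting one from the count later, but it also keeps the mask itself correct if you want to reuse it.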
How to count rows which have two or more common elements
We have an array of Trues and Falses and we need to count the Trues. What do we do? We count nonzeros:
>>> np.count_nonzero(np.apply_along_axis(at_least_two_common, 1, arr, arr[1]))
2

As we observed earlier, this result includes the row itself, so a little adjustment (-1) must be made and the whole solution will look like this (of course numpy must be imported and arr initialized too):
def at_least_two_common(row, test_array):
    return 2 <= np.intersect1d(row, test_array).size

print(np.count_nonzero(np.apply_along_axis(at_least_two_common, 1, arr, arr[1])) - 1)

This might or might not be helpful for solving your problem. As I already mentioned, numpy can be overwhelming, and if I had provided just those three lines it wouldn't have been that helpful. But knowing the steps that are stitched together in the last line, it should be pretty obvious what is going on, and maybe it gives some ideas on how to approach your problem (I am sure that this little problem of mine can be solved in a gazillion other ways using numpy's built-in functions).
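As an illustration of one of those other ways, here is a sketch that swaps intersect1d and apply_along_axis for np.isin, which tests the whole 2d array against the reference row in one call. It is a different technique from the one above, not a replacement for it: isin counts every occurrence, so the two approaches agree only as long as no row contains repeated values (true for the sample array). The names ref_index, common_counts and mask are made up for this example.

```python
import numpy as np

arr = np.array([
    [3, 8, 21, 31, 37], [15, 32, 34, 38, 40], [3, 12, 20, 26, 37],
    [6, 7, 15, 21, 30], [8, 10, 14, 27, 31], [10, 15, 26, 37, 41],
    [26, 29, 32, 36, 39], [6, 9, 11, 13, 35], [15, 18, 30, 31, 37],
    [15, 17, 24, 26, 34], [2, 3, 23, 28, 35], [1, 3, 13, 40, 43],
    [3, 4, 23, 29, 30], [7, 12, 22, 23, 33], [2, 5, 7, 30, 40],
])

ref_index = 1  # hypothetical name for the index of the row we compare against

# Element-wise membership test: True where a value also occurs in the reference row.
# Summing along axis 1 gives the number of such values per row.
common_counts = np.isin(arr, arr[ref_index]).sum(axis=1)

# Rows with at least two common values, with the reference row excluded.
mask = common_counts >= 2
mask[ref_index] = False

print(common_counts)
print(np.count_nonzero(mask))
```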
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy
Da Bishop: There's a dead bishop on the landing. I don't know who keeps bringing them in here. ....but society is to blame.