Comparing each row of two matrices

PhysChem · Apr-18-2019, 06:17 AM

I have two numpy-matrices ("A" and "B"). I want to compare every row-vector in "A" matrix with every row-vector in "B" matrix.

Here is my (not working) code. The expected output is 5.

import numpy as np
a=np.matrix([[1,2,3],[42,68,69],[1,2,3],[85,89,95]])
b=np.matrix([[42,68,69],[1,2,3],[85,89,95], [42,68,69]])


found_one=0
for i in range(3):
    for j in range(3):
        if a[i,]==b[j,]]:
            found_one=found_one+1
            
print(found_one)

Any suggestion?

**Gribouillis** · Apr-18-2019, 08:29 AM

I see a solution using collections.Counter

from collections import Counter
import numpy as np
a=np.matrix([[1,2,3],[42,68,69],[1,2,3],[85,89,95]])
b=np.matrix([[42,68,69],[1,2,3],[85,89,95], [42,68,69]])

ca, cb = (Counter(tuple(row) for row in m.A) for m in (a, b))
found = sum( v * cb.get(k, 0) for k, v in ca.items())
print(found)
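For readers following along, here is a small sketch of what the two counters hold for the example data and why the sum comes out as 5. It uses plain `np.array` and `tolist()` instead of `np.matrix` and `.A`; that substitution is an assumption made here for readability and is not part of the post above.

```python
from collections import Counter
import numpy as np

a = np.array([[1, 2, 3], [42, 68, 69], [1, 2, 3], [85, 89, 95]])
b = np.array([[42, 68, 69], [1, 2, 3], [85, 89, 95], [42, 68, 69]])

# Count how often each distinct row occurs (rows are turned into hashable tuples).
ca = Counter(map(tuple, a.tolist()))
cb = Counter(map(tuple, b.tolist()))
print(ca)
print(cb)

# A row occurring v times in a and w times in b contributes v * w matching pairs:
# (1, 2, 3): 2 * 1, (42, 68, 69): 1 * 2, (85, 89, 95): 1 * 1, so 2 + 2 + 1 = 5.
found = sum(v * cb[k] for k, v in ca.items())
print(found)  # 5
```

Each matrix is traversed only once, and the final sum runs over the distinct rows of `a` rather than over all pairs.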

**perfringo** · (This post was last modified: Apr-18-2019, 09:11 AM by perfringo.)

If I understand the objective correctly (which is in doubt), in 'pure' Python it could be done this way:

>>> a = [[1,2,3],[42,68,69],[1,2,3],[85,89,95]]
>>> b= [[42,68,69],[1,2,3],[85,89,95], [42,68,69]]
>>> len([(x, y) for x in a for y in b if x == y])
5

To 'translate' this to NumPy, only a small adjustment is needed:

>>> import numpy as np
>>> a = np.matrix([[1,2,3],[42,68,69],[1,2,3],[85,89,95]])
>>> b = np.matrix([[42,68,69],[1,2,3],[85,89,95], [42,68,69]])
>>> len([(x, y) for x in a for y in b if np.array_equal(x, y)])
5
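The reason `np.array_equal` is needed instead of `==` can be shown with a minimal sketch. It uses two one-dimensional `np.array` rows (an assumption for brevity; the thread uses `np.matrix` rows, which compare elementwise in the same way).

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([1, 2, 3])

# == compares element by element and returns a boolean array, not a single bool.
print(x == y)

try:
    if x == y:  # the truth value of a multi-element array is ambiguous
        pass
except ValueError as err:
    print("ValueError:", err)

# Two ways to reduce the comparison to a single True/False.
print(np.array_equal(x, y))  # True
print((x == y).all())        # True
```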

PhysChem · Apr-18-2019, 12:03 PM

Thank you for the answers!

DeaD_EyE · (This post was last modified: Apr-18-2019, 12:20 PM by DeaD_EyE.)

Use itertools.product, to avoid nested loops:

import itertools
[x for x,y in itertools.product(a,b) if np.array_equal(x,y)]

There is also a hint on StackOverflow how to make a Cartesian product with numpy:
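The link itself is not preserved in this copy of the thread. As an independent sketch (not the linked StackOverflow answer), all row pairs can also be compared inside NumPy with broadcasting. Plain `np.array` inputs are assumed here, since `np.matrix` does not support the three-dimensional indexing used below.

```python
import numpy as np

a = np.array([[1, 2, 3], [42, 68, 69], [1, 2, 3], [85, 89, 95]])
b = np.array([[42, 68, 69], [1, 2, 3], [85, 89, 95], [42, 68, 69]])

# Shape (4, 1, 3) against (1, 4, 3) broadcasts to (4, 4, 3):
# every row of a is compared with every row of b.
equal = (a[:, None, :] == b[None, :, :]).all(axis=2)  # shape (4, 4)
print(int(equal.sum()))  # 5

# Index pairs (i, j) for which a[i] equals b[j].
pairs = np.argwhere(equal)
print(pairs.tolist())  # [[0, 1], [1, 0], [1, 3], [2, 1], [3, 2]]
```

This still does A*B row comparisons and builds an intermediate array of that size, so it is likely to run out of memory for very large inputs.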

**perfringo** · Apr-18-2019, 12:54 PM

(Apr-18-2019, 12:19 PM)DeaD_EyE Wrote: Use itertools.product, to avoid nested loops:
[x for x,y in itertools.product(a,b) if np.array_equal(x,y)]

This is a very good point.

To get the answer the OP is looking for:

>>> from itertools import product
>>> sum(1 for x, y in product(a, b) if np.array_equal(x, y))
5

**Gribouillis** · Apr-18-2019, 02:14 PM

Note that the product method has complexity O(A*B), where A and B are the numbers of rows in the two matrices, while the hashtable (Counter) method has O(A+B) complexity. For large data, the Counter code should be faster, while the product method is probably faster for small data due to the current implementation.
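A rough way to check this on one's own machine is sketched below. The helper names `count_product` and `count_counter`, the random test data, and the 200-row size are all assumptions chosen for illustration; timings will vary, so no expected numbers are given.

```python
import timeit
from collections import Counter
from itertools import product

import numpy as np

# Hypothetical test data: 200 rows each, small value range so that matches occur.
rng = np.random.default_rng(0)
a = rng.integers(0, 5, size=(200, 3))
b = rng.integers(0, 5, size=(200, 3))

def count_product(a, b):
    # Compares every row of a with every row of b: A * B comparisons.
    return sum(1 for x, y in product(a, b) if np.array_equal(x, y))

def count_counter(a, b):
    # One pass over each matrix, then one pass over the distinct rows of a.
    ca = Counter(map(tuple, a.tolist()))
    cb = Counter(map(tuple, b.tolist()))
    return sum(v * cb[k] for k, v in ca.items())

# Both methods must agree before their timings are worth comparing.
assert count_product(a, b) == count_counter(a, b)

print("product:", timeit.timeit(lambda: count_product(a, b), number=1))
print("counter:", timeit.timeit(lambda: count_counter(a, b), number=1))
```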

PhysChem · Apr-21-2019, 03:13 PM

Hi all!

Thank you for the solutions. I applied the Counter-code, because the matrices have lots of rows (much more than in my question).

PhysChem · May-14-2019, 09:22 AM

Hi, after a long pause, I started to play with Python again. I have a question:

How can I print which row vectors were identical if I am using the solution given by Gribouillis? (I want to print the ordinal numbers of the identical rows in both matrices, and then the coordinates of the identical vectors.)

PhysChem · (This post was last modified: May-17-2019, 05:54 PM by PhysChem.)

The only idea I have:
Transforming the matrices into two lists and using a nested for loop...

a_lista=a.tolist()
b_lista=b.tolist()
#"a" and "b" were my previously defined np.matrices
for i in range(4):
   for j in range(4):
      if a_lista[i]==b_lista[j]:
         print(i,j,a_lista[i])

Unfortunately, my "solution" is really crap, because I want to use this on far bigger matrices :-(
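One possible way to avoid the nested loop, in the spirit of the Counter approach, is to store the row indices of one matrix in a dictionary and look up each row of the other. This is only a sketch: the names `rows_in_b` and `matches` are made up here, and plain `np.array` inputs are assumed (`tolist()` also returns a list of row lists for `np.matrix`).

```python
from collections import defaultdict
import numpy as np

a = np.array([[1, 2, 3], [42, 68, 69], [1, 2, 3], [85, 89, 95]])
b = np.array([[42, 68, 69], [1, 2, 3], [85, 89, 95], [42, 68, 69]])

# Map each distinct row of b to the list of indices where it occurs.
rows_in_b = defaultdict(list)
for j, row in enumerate(b.tolist()):
    rows_in_b[tuple(row)].append(j)

# For each row of a, look up the indices of identical rows in b.
matches = []
for i, row in enumerate(a.tolist()):
    for j in rows_in_b.get(tuple(row), []):
        matches.append((i, j, row))

for i, j, row in matches:
    print(i, j, row)
```

The work is proportional to the number of rows in both matrices plus the number of matching pairs printed, instead of always A*B comparisons.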
