Find matching rows in 2 dimensional numpy array

Question

I would like to get the indices of the rows in a 2-dimensional NumPy array that match a given row. For example, my array is this:

vals = np.array([[0, 0],
                 [1, 0],
                 [2, 0],
                 [0, 1],
                 [1, 1],
                 [2, 1],
                 [0, 2],
                 [1, 2],
                 [2, 2],
                 [0, 3],
                 [1, 3],
                 [2, 3],
                 [0, 0],
                 [1, 0],
                 [2, 0],
                 [0, 1],
                 [1, 1],
                 [2, 1],
                 [0, 2],
                 [1, 2],
                 [2, 2],
                 [0, 3],
                 [1, 3],
                 [2, 3]])

I would like to get the indices of the rows that match [0, 1], which are 3 and 15. When I do something like numpy.where(vals == [0, 1]) I get...

(array([ 0,  3,  3,  4,  5,  6,  9, 12, 15, 15, 16, 17, 18, 21]), array([0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0]))

I want index array([3, 15]).

Aaron Hall · Accepted Answer · 2018-05-19 01:36:05Z

You need the np.where function to get the indexes:

>>> np.where((vals == (0, 1)).all(axis=1))
(array([ 3, 15]),)

Or, as the documentation states:

If only condition is given, return condition.nonzero()

You could directly call .nonzero() on the array returned by .all:

>>> (vals == (0, 1)).all(axis=1).nonzero()
(array([ 3, 15]),)

To break that down:

>>> vals == (0, 1)
array([[ True, False],
       [False, False],
       ...
       [ True, False],
       [False, False],
       [False, False]], dtype=bool)

and calling the .all method on that array (with axis=1) gives you True where both are True:

>>> (vals == (0, 1)).all(axis=1)
array([False, False, False,  True, False, False, False, False, False,
       False, False, False, False, False, False,  True, False, False,
       False, False, False, False, False, False], dtype=bool)

and to get which indexes are True:

>>> np.where((vals == (0, 1)).all(axis=1))
(array([ 3, 15]),)

or

>>> (vals == (0, 1)).all(axis=1).nonzero()
(array([ 3, 15]),)
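
The steps above can be collected into one self-contained script. This is only a sketch: the helper name find_row_indices is made up for illustration, and vals is rebuilt with np.tile on the assumption that the question's array is its first 12 rows repeated twice.

```python
import numpy as np


def find_row_indices(arr, row):
    """Return the indices of the rows of a 2-D array equal to `row` (hypothetical helper)."""
    arr = np.asarray(arr)
    row = np.asarray(row)
    # Broadcasting compares `row` against every row: shape (n_rows, n_cols).
    elementwise = arr == row
    # A row matches only if all of its columns match.
    row_matches = elementwise.all(axis=1)
    # flatnonzero returns the positions of the True entries as a 1-D array.
    return np.flatnonzero(row_matches)


# Rebuild the question's 24 rows: a 12-row pattern repeated twice.
pattern = np.array([[x, y] for y in range(4) for x in range(3)])
vals = np.tile(pattern, (2, 1))

indices = find_row_indices(vals, [0, 1])
print(indices)
```

np.flatnonzero is used here instead of np.where so that the result is a plain 1-D index array rather than a one-element tuple.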

I find my solution a bit more readable, but as unutbu points out, the following may be faster, and returns the same value as (vals == (0, 1)).all(axis=1):

>>> (vals[:, 0] == 0) & (vals[:, 1] == 1)

I tend to favor np.nonzero over the one-argument form of np.where, to avoid confusion with the completely different np.where(bool, if_true, if_false) function.
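
To make the distinction concrete, here is a small sketch contrasting the two call forms. The mask and the values 1 and -1 are arbitrary and chosen only for illustration.

```python
import numpy as np

mask = np.array([False, True, False, True])

# One-argument form: behaves like np.nonzero(mask) and returns a tuple of index arrays.
idx_where = np.where(mask)
idx_nonzero = np.nonzero(mask)

# Three-argument form: picks 1 where mask is True and -1 elsewhere (no indices involved).
chosen = np.where(mask, 1, -1)

print(idx_where, idx_nonzero)
print(chosen)
```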

unutbu · Accepted Answer · 2014-09-13 14:24:35Z


In [5]: np.where((vals[:,0] == 0) & (vals[:,1]==1))[0]
Out[5]: array([ 3, 15])

I'm not sure why, but this is significantly faster than
np.where((vals == (0, 1)).all(axis=1)):

In [34]: vals2 = np.tile(vals, (1000,1))

In [35]: %timeit np.where((vals2 == (0, 1)).all(axis=1))[0]
1000 loops, best of 3: 808 µs per loop

In [36]: %timeit np.where((vals2[:,0] == 0) & (vals2[:,1]==1))[0]
10000 loops, best of 3: 152 µs per loop
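
Outside IPython, the comparison can be reproduced with the standard timeit module. The following is a rough sketch: the function names are made up, the array is rebuilt from an assumed 12-row pattern, and the timings will vary with machine and NumPy version, so no expected numbers are shown.

```python
import timeit

import numpy as np

# 24,000 rows, the same size as np.tile(vals, (1000, 1)) for the 24-row vals.
pattern = np.array([[x, y] for y in range(4) for x in range(3)])
vals2 = np.tile(pattern, (2000, 1))


def with_all(a):
    # Compare whole rows, then reduce across columns.
    return np.where((a == (0, 1)).all(axis=1))[0]


def with_columns(a):
    # Compare each column separately and combine the two masks.
    return np.where((a[:, 0] == 0) & (a[:, 1] == 1))[0]


# Both approaches should return identical indices before timing them.
same = np.array_equal(with_all(vals2), with_columns(vals2))

runs = 200
t_all = timeit.timeit(lambda: with_all(vals2), number=runs)
t_cols = timeit.timeit(lambda: with_columns(vals2), number=runs)
print(f"identical results: {same}")
print(f".all(axis=1):     {t_all / runs * 1e6:.1f} µs per call")
print(f"per-column masks: {t_cols / runs * 1e6:.1f} µs per call")
```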

edited Sep 13, 2014 at 14:24

answered Sep 13, 2014 at 13:25


swimfar2 · Accepted Answer · 2023-08-30 10:39:57Z


Using the numpy_indexed package that I created, you can simply write:

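import numpy as np
import numpy_indexed as npi

# npi.contains yields one boolean per row of vals; flatnonzero turns that into indices
print(np.flatnonzero(npi.contains([[0, 1]], vals)))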
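
For readers who would rather not add a dependency, a similar membership test against several target rows can be sketched in plain NumPy with broadcasting. This is not how numpy_indexed works internally; the helper name rows_in is hypothetical, and vals is rebuilt from an assumed 12-row pattern.

```python
import numpy as np


def rows_in(arr, targets):
    """Boolean mask marking the rows of `arr` that equal any row of `targets` (hypothetical helper)."""
    arr = np.asarray(arr)
    targets = np.atleast_2d(targets)
    # Compare every row with every target: shape (n_rows, n_targets, n_cols).
    eq = arr[:, None, :] == targets[None, :, :]
    # A row matches a target if all columns agree; keep it if it matches any target.
    return eq.all(axis=2).any(axis=1)


pattern = np.array([[x, y] for y in range(4) for x in range(3)])
vals = np.tile(pattern, (2, 1))

single = np.flatnonzero(rows_in(vals, [[0, 1]]))
several = np.flatnonzero(rows_in(vals, [[0, 1], [2, 3]]))
print(single)
print(several)
```

The intermediate array has n_rows × n_targets × n_cols elements, so this sketch suits a handful of target rows rather than large sets.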

answered Sep 13, 2014 at 15:26 by Eelco Hoogendoorn
edited Aug 30, 2023 at 10:39 by swimfar2

2 Comments

Mad Physicist Over a year ago

Nice that you made this. You ought to make your affiliation explicit to comply with site rules. I notice that you did in another answer that came up in my research recently.

Jiadong Over a year ago

Can you show some benchmarks? Otherwise it won't convince me, and maybe others, to use it.
