65

I would like to get the indices of the rows in a 2-dimensional NumPy array that match a given row. For example, my array is this:

vals = np.array([[0, 0],
                 [1, 0],
                 [2, 0],
                 [0, 1],
                 [1, 1],
                 [2, 1],
                 [0, 2],
                 [1, 2],
                 [2, 2],
                 [0, 3],
                 [1, 3],
                 [2, 3],
                 [0, 0],
                 [1, 0],
                 [2, 0],
                 [0, 1],
                 [1, 1],
                 [2, 1],
                 [0, 2],
                 [1, 2],
                 [2, 2],
                 [0, 3],
                 [1, 3],
                 [2, 3]])

I would like to get the indices of the rows that match [0, 1], which are indices 3 and 15. When I do something like numpy.where(vals == [0, 1]) I get...

(array([ 0,  3,  3,  4,  5,  6,  9, 12, 15, 15, 16, 17, 18, 21]), array([0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0]))

I want index array([3, 15]).

3 Answers

97

You need the np.where function to get the indexes:

>>> np.where((vals == (0, 1)).all(axis=1))
(array([ 3, 15]),)

Or, as the documentation states:

If only condition is given, return condition.nonzero()

You could directly call .nonzero() on the array returned by .all:

>>> (vals == (0, 1)).all(axis=1).nonzero()
(array([ 3, 15]),)

To break that down:

>>> vals == (0, 1)
array([[ True, False],
       [False, False],
       ...
       [ True, False],
       [False, False],
       [False, False]], dtype=bool)

and calling the .all method on that array (with axis=1) gives you True where both are True:

>>> (vals == (0, 1)).all(axis=1)
array([False, False, False,  True, False, False, False, False, False,
       False, False, False, False, False, False,  True, False, False,
       False, False, False, False, False, False], dtype=bool)

and to get which indexes are True:

>>> np.where((vals == (0, 1)).all(axis=1))
(array([ 3, 15]),)

or

>>> (vals == (0, 1)).all(axis=1).nonzero()
(array([ 3, 15]),)
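
The steps above can be put together in one small, self-contained script. This is only a minimal sketch: the helper name find_rows is hypothetical (it is not part of NumPy), and the array is rebuilt with a comprehension instead of being typed out, on the assumption that it should equal the vals from the question.

```python
import numpy as np


def find_rows(arr, row):
    """Return the indices of the rows of `arr` equal to `row` (hypothetical helper)."""
    # Broadcasting compares `row` against every row of `arr`, element by element.
    elementwise = arr == np.asarray(row)
    # A row matches only if every element in it matched.
    mask = elementwise.all(axis=1)
    # flatnonzero gives the positions of the True entries as a plain 1-D array.
    return np.flatnonzero(mask)


# Same data as in the question: a 12-row pattern repeated twice.
block = np.array([[x, y] for y in range(4) for x in range(3)])
vals = np.vstack([block, block])

print(find_rows(vals, [0, 1]))  # prints [ 3 15]
```

Using np.flatnonzero here avoids having to unpack the one-element tuple that np.where and .nonzero() return.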

I find my solution a bit more readable, but as unutbu points out, the following may be faster, and returns the same value as (vals == (0, 1)).all(axis=1):

>>> (vals[:, 0] == 0) & (vals[:, 1] == 1)
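
As a quick check that the two masks really agree, something like the following sketch could be used. The variable names (mask_all, mask_cols) are just illustrative, and vals is rebuilt here so the snippet runs on its own.

```python
import numpy as np

# Rebuild the array from the question: a 12-row pattern repeated twice.
block = np.array([[x, y] for y in range(4) for x in range(3)])
vals = np.vstack([block, block])

# Whole-row comparison, reduced along axis 1.
mask_all = (vals == (0, 1)).all(axis=1)

# Column-by-column comparison combined with a bitwise AND.
mask_cols = (vals[:, 0] == 0) & (vals[:, 1] == 1)

same = np.array_equal(mask_all, mask_cols)
indices = np.flatnonzero(mask_cols)
print(same, indices)  # prints True [ 3 15]
```

Note that the parentheses around each comparison are required, because & binds more tightly than == in Python.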

1 Comment

I tend to favor np.nonzero over the np.where alias, to avoid confusion with the completely different np.where(bool, if_true, if_false) function
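
To illustrate the distinction this comment draws, here is a small sketch with a made-up boolean mask, contrasting the one-argument form of np.where with the three-argument form.

```python
import numpy as np

# A made-up mask, purely for illustration.
mask = np.array([False, True, False, True])

# One-argument np.where behaves like np.nonzero: a tuple of index arrays.
idx_where = np.where(mask)
idx_nonzero = np.nonzero(mask)

# Three-argument np.where is a different operation: an elementwise choice.
chosen = np.where(mask, 1, -1)

print(idx_where[0], idx_nonzero[0], chosen)
```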
16
In [5]: np.where((vals[:,0] == 0) & (vals[:,1]==1))[0]
Out[5]: array([ 3, 15])

I'm not sure why, but this is significantly faster than np.where((vals == (0, 1)).all(axis=1)):

In [34]: vals2 = np.tile(vals, (1000,1))

In [35]: %timeit np.where((vals2 == (0, 1)).all(axis=1))[0]
1000 loops, best of 3: 808 µs per loop

In [36]: %timeit np.where((vals2[:,0] == 0) & (vals2[:,1]==1))[0]
10000 loops, best of 3: 152 µs per loop
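
Outside IPython, the same comparison could be reproduced with the standard timeit module. This is a rough sketch, not a rigorous benchmark: the function names by_all and by_columns are made up for the example, and the timings will vary with hardware and NumPy version, so no expected numbers are shown.

```python
import timeit

import numpy as np

# Rebuild the array from the question and tile it as in the session above.
block = np.array([[x, y] for y in range(4) for x in range(3)])
vals = np.vstack([block, block])
vals2 = np.tile(vals, (1000, 1))  # 24000 rows


def by_all():
    # Whole-row comparison reduced with .all(axis=1).
    return np.where((vals2 == (0, 1)).all(axis=1))[0]


def by_columns():
    # One comparison per column, combined with a bitwise AND.
    return np.where((vals2[:, 0] == 0) & (vals2[:, 1] == 1))[0]


# Both approaches should find exactly the same rows.
assert np.array_equal(by_all(), by_columns())

# Take the best of three runs of 100 calls each, reported per call.
t_all = min(timeit.repeat(by_all, number=100, repeat=3)) / 100
t_cols = min(timeit.repeat(by_columns, number=100, repeat=3)) / 100
print(f"all(axis=1): {t_all * 1e6:.0f} µs per call")
print(f"column-wise: {t_cols * 1e6:.0f} µs per call")
```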


3

Using the numpy_indexed package that I created, you can simply write:

import numpy as np
import numpy_indexed as npi

# vals is the array from the question
print(np.flatnonzero(npi.contains([[0, 1]], vals)))

2 Comments

Nice that you made this. You ought to make your affiliation explicit to comply with site rules. I notice that you did in another answer that came up in my research recently.
Can you show some benchmarks? Otherwise it won't convince me, or maybe others, to use it.
