For all elements in a Numpy array, return their index in another array

Question

Given an array:

big_array = np.array(['dog', 'cat', 'dog', 'dog', 'dog', 'cat', 'cat', 'dog'])

I want to get the position of each of these elements in a smaller array:

small_array = np.array(['dog', 'cat'])

It should return:

[0, 1, 0, 0, 0, 1, 1, 0]

It would be the equivalent of:

[np.where(i == small_array)[0][0] for i in big_array]

Can it be done without a list comprehension, preferably with a NumPy function?

Are they guaranteed to exist? If not, what value should be returned for elements not found?
– John Zwinck, Commented Nov 3, 2019 at 2:36

John Zwinck · Accepted Answer · 2019-11-03 02:40:25Z

2

If you're willing to sort small_array, this will do it:

small_array.sort() # in-place, or `x = np.sort(small_array)` for a sorted copy
np.searchsorted(small_array, big_array)
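A minimal sketch of the same approach using a sorted copy, so that small_array keeps its original order. Note that the positions returned refer to the sorted array (here ['cat', 'dog']), not to the original order of small_array. The name sorted_small is only illustrative, and the sketch assumes every element of big_array is present in small_array.

```python
import numpy as np

big_array = np.array(['dog', 'cat', 'dog', 'dog', 'dog', 'cat', 'cat', 'dog'])
small_array = np.array(['dog', 'cat'])

# Sorted copy; small_array itself is left untouched.
sorted_small = np.sort(small_array)  # ['cat', 'dog']

# Position of each big_array element within the sorted copy.
positions = np.searchsorted(sorted_small, big_array)

print(positions)
# [1 0 1 1 1 0 0 1]
```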


3 Comments

Nicolas Gervais Over a year ago

I'm not sure where you are going with your first line, but your second line was sufficient to make it work.

John Zwinck Over a year ago

@NicolasGervais: See the documentation: docs.scipy.org/doc/numpy/reference/generated/… - searchsorted() requires that the first argument is sorted, or you have to provide a third argument to tell it how to sort (as in Austin's answer). If you do neither of those things, it will give you wrong results (including for your example input).

Nicolas Gervais Over a year ago

This is what I was trying to do, and it works if I only use your second line of code.

Austin · Accepted Answer · 2019-11-03 02:40:28Z

2

There is no single NumPy function for this as far as I know, but you could combine argsort and searchsorted, something like:

import numpy as np

big_array = np.array(['dog', 'cat', 'dog', 'dog', 'dog', 'cat', 'cat', 'dog'])

small_array = np.array(['dog', 'cat'])

sortd = np.argsort(small_array)
res = sortd[np.searchsorted(small_array[sortd], big_array)]

print(res)
# [0 1 0 0 0 1 1 0]
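The snippet above assumes that every element of big_array is present in small_array. Here is a hedged sketch of one way to handle missing elements, as raised in the comment on the question. It passes the argsort result through the `sorter` argument of searchsorted instead of indexing small_array with it, and marks elements that are not found with -1. The helper name `index_in` and the -1 sentinel are assumptions made for illustration, not part of NumPy, and small is assumed to be non-empty.

```python
import numpy as np

def index_in(small, big, missing=-1):
    """For each element of big, return its index in small, or `missing`."""
    order = np.argsort(small)
    # Insertion points in the sorted view of small.
    pos = np.searchsorted(small, big, sorter=order)
    # searchsorted can return len(small) for values larger than every entry;
    # clip so the lookup below stays in bounds.
    pos = np.clip(pos, 0, len(small) - 1)
    candidates = order[pos]
    # Keep a candidate only where it is an exact match.
    return np.where(small[candidates] == big, candidates, missing)

big_array = np.array(['dog', 'cat', 'dog', 'fish', 'cat', 'ant'])
small_array = np.array(['dog', 'cat'])

print(index_in(small_array, big_array))
# [ 0  1  0 -1  1 -1]
```

The exact-match check is what separates a real hit from a mere insertion point, since searchsorted always returns a position even for values that are absent.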


wwii · Accepted Answer · 2019-11-03 02:56:46Z

Compare the two arrays with broadcasting, then use argmax to find the index of the first True along the last axis.

>>> a = np.array(['dog', 'cat', 'dog', 'dog', 'dog', 'cat', 'cat', 'dog'])
>>> b = np.array(['dog','cat'])
>>> c = a[:,None] == b
>>> c.argmax(axis=-1)
array([0, 1, 0, 0, 0, 1, 1, 0], dtype=int64)

>>> a[:,None] == b
array([[ True, False],
       [False,  True],
       [ True, False],
       [ True, False],
       [ True, False],
       [False,  True],
       [False,  True],
       [ True, False]])
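One caveat with argmax here: for an element of `a` that does not appear in `b`, the corresponding row of `c` is all False and argmax still returns 0. A minimal sketch that masks such rows with -1 (the sentinel and the names `idx`, `found` and `result` are assumptions chosen for illustration). The comparison matrix has len(a) * len(b) entries, so this is likely best suited to a small `b`.

```python
import numpy as np

a = np.array(['dog', 'cat', 'bird', 'dog'])
b = np.array(['dog', 'cat'])

c = a[:, None] == b        # boolean matrix of shape (len(a), len(b))
idx = c.argmax(axis=-1)    # index of the first True in each row (0 if none)
found = c.any(axis=-1)     # True for rows with at least one match
result = np.where(found, idx, -1)

print(result)
# [ 0  1 -1  0]
```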
