How to get 2d array of indices from 1d array?

Question

I'm looking for an efficient way to return indices for a 2d array based on values in a 1d array. I currently have a nested for loop set up that is painfully slow.

Here is some example data and what I want to get:

import numpy as np

data2d = np.array([[1, 2], [1, 3], [3, 4], [1, 2], [7, 9]])

data1d = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

For each element of data2d, I would like to return the index of the matching value in data1d. My desired output would be this 2d array:

locs = np.array([[0, 1], [0, 2], [2, 3], [0, 1], [6, 8]])

The only thing I've come up with is the nested for loop:

locs = np.full(data2d.shape, np.nan)

for i in range(data2d.shape[0]):
    for j in range(data2d.shape[1]):
        # np.where returns a tuple of index arrays; take the first match
        locs[i, j] = np.where(data1d == data2d[i, j])[0][0]

This would be fine for a small set of data, but I have 87,600 2d grids, each with 428 x 614 grid points.

Yes, it is sorted for the data I'm working with. And yes, all points are guaranteed to exist.
– Dan McEvoy, Commented Jan 25, 2019 at 20:19

cs95 · Accepted Answer · 2019-01-25 20:19:19Z

1

Use np.searchsorted:

np.searchsorted(data1d, data2d.ravel()).reshape(data2d.shape)

array([[0, 1],
       [0, 2],
       [2, 3],
       [0, 1],
       [6, 8]])

searchsorted runs a binary search in data1d for each element of the ravelled data2d, so data1d must be sorted. The result is then reshaped back to the shape of data2d.
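One caveat: np.searchsorted returns insertion points rather than matches, so a value that is absent from data1d still gets an index back. Below is a minimal sketch of a sanity check, assuming data1d is sorted in ascending order. The helper name lookup_sorted is made up for illustration and is not part of NumPy.

```python
import numpy as np

def lookup_sorted(sorted_values, queries):
    """Return the index in sorted_values of each element of queries.

    Hypothetical helper; assumes sorted_values is 1-D and sorted ascending.
    """
    # searchsorted accepts queries of any shape and returns the same shape,
    # so no ravel/reshape is needed here.
    idx = np.searchsorted(sorted_values, queries)
    # Insertion points can equal len(sorted_values); clip before indexing back.
    safe = np.clip(idx, 0, len(sorted_values) - 1)
    if not np.array_equal(sorted_values[safe], queries):
        raise ValueError("some queries are not present in sorted_values")
    return idx

data1d = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
data2d = np.array([[1, 2], [1, 3], [3, 4], [1, 2], [7, 9]])
locs = lookup_sorted(data1d, data2d)
```

Because the query shape is preserved, the same call should also work on a 3-D stack of grids, memory permitting.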

Another option is to build an index once and then query it in constant time per lookup. You can do this with pandas' Index API.

import pandas as pd

idx = pd.Index([1,2,3,4,5,6,7,8,9])
idx
#  Int64Index([1, 2, 3, 4, 5, 6, 7, 8, 9], dtype='int64')

idx.get_indexer(data2d.ravel()).reshape(data2d.shape)

array([[0, 1],
       [0, 2],
       [2, 3],
       [0, 1],
       [6, 8]])
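Unlike searchsorted, get_indexer does not need the index to be sorted, but it does need the index values to be unique. It also flags values that are not in the index by returning -1. A small sketch of that behaviour follows; the queries array is an invented example that deliberately contains a missing value.

```python
import numpy as np
import pandas as pd

data1d = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
idx = pd.Index(data1d)

# Invented example input: 10 is deliberately absent from data1d.
queries = np.array([[1, 2], [7, 10]])
locs = idx.get_indexer(queries.ravel()).reshape(queries.shape)

# get_indexer marks labels that are not in the index with -1.
missing = locs == -1
```

Checking missing.any() before using locs is a cheap way to confirm the "all points exist" assumption on real data.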

answered Jan 25, 2019 at 20:19

cs95


1 Comment

Dan McEvoy Over a year ago

Wow, those both seem like great solutions, thanks! Now I just need to see how it performs on the larger data.

Aiden Zhao · Accepted Answer · 2019-01-25 20:29:45Z

0

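This should also be faster than the original loop, since each dictionary lookup takes constant time instead of scanning data1d with np.where, although it still loops in Python:

import numpy as np

data2d = np.array([[1, 2], [1, 3], [3, 4], [1, 2], [7, 9]])
data1d = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

# Map each value in data1d to its position.
idxdict = dict(zip(data1d, range(len(data1d))))

# Write into a new array so that data2d is not overwritten.
locs = np.empty_like(data2d)
for i in range(data2d.shape[0]):
    for j in range(data2d.shape[1]):
        locs[i, j] = idxdict[data2d[i, j]]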
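The same value-to-position mapping can be applied without a Python loop by storing it in an array instead of a dict. This is a different technique from the dictionary above (a NumPy lookup table plus fancy indexing). It is only a sketch and assumes the values are non-negative integers with a modest maximum; the name lut is made up for illustration.

```python
import numpy as np

data2d = np.array([[1, 2], [1, 3], [3, 4], [1, 2], [7, 9]])
data1d = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

# Lookup table: position v holds the index of value v in data1d, or -1 if absent.
# Assumes non-negative integer values with a modest maximum.
lut = np.full(data1d.max() + 1, -1, dtype=np.intp)
lut[data1d] = np.arange(len(data1d))

# Fancy indexing applies the mapping to every element at once.
locs = lut[data2d]
```

A value in data2d larger than data1d.max() would raise an IndexError here, so this only fits data where every point is known to fall inside the range of data1d.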

edited Jan 25, 2019 at 20:29

answered Jan 25, 2019 at 20:18

Aiden Zhao
