Numpy Selecting Elements given row and column index arrays

Question

I have row indices as a 1D NumPy array and column indices as a list of NumPy arrays (the list has the same length as the row index array). I want to extract the values corresponding to these indices. How can I do it?

This is an example of the input and the output I want:

A = np.array([[2, 1, 1, 0, 0],
              [3, 0, 2, 1, 1],
              [0, 0, 2, 1, 0],
              [0, 3, 3, 3, 0],
              [0, 1, 2, 1, 0],
              [0, 1, 3, 1, 0],
              [2, 1, 3, 0, 1],
              [2, 0, 2, 0, 2],
              [3, 0, 3, 1, 2]])

row_ind = np.array([0,2,4])
col_ind = [np.array([0, 1, 2]), np.array([2, 3]), np.array([1, 2, 3])]

I want the output as a list of NumPy arrays (or a list of lists):

[np.array([2, 1, 1]), np.array([2, 1]), np.array([1, 2, 1])]

My biggest concern is efficiency: my array A has shape 20K x 10K.

You would normally use np.ix_ for that, but to clarify: do you want to avoid advanced indexing for performance reasons?
– FObersteiner, Commented Feb 18, 2020 at 11:14
@MrFuppes I am fine with advanced indexing, but I do not see how I can apply it here.
– Shew, Commented Feb 18, 2020 at 11:17
What I meant was that np.ix_ uses advanced indexing; see my answer below. Make sure to watch memory consumption if you process large arrays...
– FObersteiner, Commented Feb 18, 2020 at 11:24
Since the arrays in col_ind vary in length, I think you'll require a loop.
– hpaulj, Commented Feb 18, 2020 at 11:36
In your example, shouldn't the third array of the output be np.array([1, 2, 1])?
– kuzand, Commented Feb 18, 2020 at 22:19

FObersteiner · Accepted Answer · 2020-02-19 08:33:43Z

As @hpaulj commented, you likely won't be able to avoid looping, because the arrays in col_ind vary in length. For example:

import numpy as np

A = np.array([[2, 1, 1, 0, 0],
              [3, 0, 2, 1, 1],
              [0, 0, 2, 1, 0],
              [0, 3, 3, 3, 0],
              [0, 1, 2, 1, 0],
              [0, 1, 3, 1, 0],
              [2, 1, 3, 0, 1],
              [2, 0, 2, 0, 2],
              [3, 0, 3, 1, 2]])


row_ind = np.array([0,2,4])
col_ind = [np.array([0, 1, 2]), np.array([2, 3]), np.array([1, 2, 3])]

# make sure the following code is safe...
assert row_ind.shape[0] == len(col_ind)

# 1) select row (A[r, :]), then select elements (cols) [col_ind[i]]:
output = [A[r, :][col_ind[i]] for i, r in enumerate(row_ind)]

# output
# [array([2, 1, 1]), array([2, 1]), array([1, 2, 1])]
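Since efficiency is the main concern, here is a hedged sketch of one possible alternative that is not part of the original answer: flatten all (row, column) pairs, do a single advanced-indexing lookup, and split the result back into per-row pieces. The names lengths, rows_flat, cols_flat, values_flat and output_flat are made up for illustration, and the sketch assumes col_ind is non-empty. A Python-level pass over col_ind is still needed to get the lengths, and whether this beats the list comprehension depends on how many rows and columns are selected, so it is worth timing on the real data.

```python
import numpy as np

A = np.array([[2, 1, 1, 0, 0],
              [3, 0, 2, 1, 1],
              [0, 0, 2, 1, 0],
              [0, 3, 3, 3, 0],
              [0, 1, 2, 1, 0],
              [0, 1, 3, 1, 0],
              [2, 1, 3, 0, 1],
              [2, 0, 2, 0, 2],
              [3, 0, 3, 1, 2]])

row_ind = np.array([0, 2, 4])
col_ind = [np.array([0, 1, 2]), np.array([2, 3]), np.array([1, 2, 3])]

# Number of columns requested for each row (hypothetical helper name).
lengths = np.array([len(c) for c in col_ind])

# Repeat each row index once per requested column so that rows and
# columns line up pairwise: rows_flat[k] goes with cols_flat[k].
rows_flat = np.repeat(row_ind, lengths)
cols_flat = np.concatenate(col_ind)  # assumes col_ind is not empty

# One advanced-indexing call; this copies only the selected elements.
values_flat = A[rows_flat, cols_flat]

# Cut the flat result back into one array per row. The split points
# are the cumulative lengths, without the final total.
output_flat = np.split(values_flat, np.cumsum(lengths)[:-1])

print(output_flat)
```

For the example input this yields the same three arrays as the list comprehension above. The pieces returned by np.split are views into values_flat, not into A.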

Another way to do this could be to use np.ix_ (it still requires looping). Use it with caution for very large arrays, though: indexing with the arrays that np.ix_ builds is advanced indexing, which, in contrast to basic slicing, returns a copy of the selected data instead of a view. See the NumPy indexing docs.
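A minimal sketch of what the np.ix_ variant could look like, reusing the question's example data. The name output_ix is made up for illustration. Because np.ix_([r], c) selects a block of shape (1, len(c)), the result is flattened with ravel() to match the 1D arrays produced by the loop above.

```python
import numpy as np

A = np.array([[2, 1, 1, 0, 0],
              [3, 0, 2, 1, 1],
              [0, 0, 2, 1, 0],
              [0, 3, 3, 3, 0],
              [0, 1, 2, 1, 0],
              [0, 1, 3, 1, 0],
              [2, 1, 3, 0, 1],
              [2, 0, 2, 0, 2],
              [3, 0, 3, 1, 2]])

row_ind = np.array([0, 2, 4])
col_ind = [np.array([0, 1, 2]), np.array([2, 3]), np.array([1, 2, 3])]

# The pairing of rows and column arrays only makes sense if both
# have the same number of entries.
assert row_ind.shape[0] == len(col_ind)

# np.ix_ builds an open mesh from the given sequences; indexing A with it
# selects the cross product of rows [r] and columns c, here a 1 x len(c)
# block. ravel() turns that block into a 1D array.
output_ix = [A[np.ix_([r], c)].ravel() for r, c in zip(row_ind, col_ind)]

print(output_ix)
```

With a single row per lookup, np.ix_ offers no advantage over A[r, c]; it becomes useful when several rows share the same set of columns.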
