
I have a large (90k x 90k) numpy ndarray and I need to zero out a block of it. I have a list of about 30k indices that indicate which rows and columns need to be zero. The indices aren't necessarily contiguous, so a[min:max, min:max] style slicing isn't possible.

As a toy example, I can start with a 2D array of non-zero values, but I can't seem to write zeros the way I expect.

import numpy as np

a = np.ones((6, 8))
indices = [2, 3, 5]
# I thought this would work, but it does not.
# It correctly writes to (2,2), (3,3), and (5,5), but not all
# combinations of (2, 3), (2, 5), (3, 2), (3, 5), (5, 2), or (5, 3)
a[indices, indices] = 0.0
print(a)

[[1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 0. 1. 1. 1. 1. 1.]
 [1. 1. 1. 0. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 0. 1. 1.]]
# I thought this would fix that problem, but it doesn't change the array.
a[indices, :][:, indices] = 0.0
print(a)

[[1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]]

In this toy example, I'm hoping for this result.

[[1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 0. 0. 1. 0. 1. 1.]
 [1. 1. 0. 0. 1. 0. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 0. 0. 1. 0. 1. 1.]]

I could probably write a cumbersome loop or build some combinatorially huge list of indices to do this, but it seems intuitive that this must be supported in a cleaner way; I just can't find the syntax to make it happen. Any ideas?
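For reference, the cumbersome version I have in mind would look something like this (just a sketch continuing the toy example above; on the real data that's roughly 30k x 30k, about 900 million assignments, which is why I'd rather not):

# Brute force: visit every (row, column) combination one pair at a time.
for r in indices:
    for c in indices:
        a[r, c] = 0.0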

2 Comments

  • Explore using np.ix_. Commented Jan 16 at 23:09
  • @hpaulj, I had never seen this function, and wouldn't have known to search for it. And it's exactly what I need. Thanks!! I wrote an answer, but if you'd rather take the credit, I'll delete mine. Commented Jan 16 at 23:19

1 Answer

Based on hpaulj's comment, I came up with this, which works perfectly on the toy example.

a[np.ix_(indices, indices)] = 0.0
print(a)

[[1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 0. 0. 1. 0. 1. 1.]
 [1. 1. 0. 0. 1. 0. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 0. 0. 1. 0. 1. 1.]]

It also worked beautifully on the real data. It was faster than I expected and didn't noticeably increase memory consumption. Exhausting memory has been a constant concern with these giant arrays.
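In case the memory behavior surprises anyone else: as far as I can tell, np.ix_ never materializes the full set of coordinate pairs. It just returns two small reshaped index arrays that broadcast against each other (a quick sketch on the toy example):

rows, cols = np.ix_(indices, indices)
print(rows.shape, cols.shape)
# (3, 1) (1, 3)
# rows is a column vector of the indices and cols is a row vector of the same indices;
# broadcasting them in a[rows, cols] covers all 3 x 3 combinations without ever
# building those combinations explicitly.

On the real data that means two arrays of about 30k elements each instead of roughly 900 million explicit pairs, which lines up with the low memory use I saw.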


1 Comment

Look at the result of np.ix_(indices, indices). It is easy to create the same two arrays without ix_. It's a convenience function, not a necessity. Those two arrays are broadcastable.
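For example, a sketch of the equivalent assignment without ix_ (assuming indices is a plain Python list, as in the question):

idx = np.asarray(indices)
# idx[:, None] has shape (3, 1) and idx[None, :] has shape (1, 3); broadcasting the
# pair addresses every row/column combination, exactly like np.ix_(indices, indices).
a[idx[:, None], idx[None, :]] = 0.0
# Note this writes directly into a. The chained form a[indices, :][:, indices] = 0.0
# fails because the first fancy index returns a copy, so the assignment only touches
# that temporary copy.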
