I have a numpy array of 1650 rows and 1275 columns containing only 0s and 255s. I want to get the index of the first zero in each row and store those indices in an array. I used a for loop to achieve that. Here is the example code:

import numpy as np

# new_arr is a 2-D numpy array; k collects one index per row
k = []
for i in range(new_arr.shape[0]):
    row = new_arr[i, :]
    if np.any(row == 0):
        # index of the first zero in this row
        k.append(np.where(row == 0)[0][0])
    else:
        # no zero in this row
        k.append(-1)

It takes around 1.3 seconds for 1650 rows. Is there another way or function that gets the array of indices much faster?

  • What if there's no zero in a row? Commented May 31, 2017 at 14:00
  • I need an answer for that too! I'm sorry, I didn't even think about it. Commented May 31, 2017 at 14:07
  • So, what answer do you need for such a case? Commented May 31, 2017 at 14:07
  • Would it be fine if I put -1 when there are no zeros in a row? Commented May 31, 2017 at 14:10
  • Updated my post on it. Commented May 31, 2017 at 14:13

1 Answer

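One approach is to build a boolean mask of the matches with ==0 and then take argmax along each row, i.e. argmax(axis=1). Because argmax returns the index of the first occurrence of the maximum, and True is the maximum in a boolean mask, this gives the first matching index for each row: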

(arr==0).argmax(axis=1)

Sample run -

In [443]: arr
Out[443]: 
array([[0, 1, 0, 2, 2, 1, 2, 2],
       [1, 1, 2, 2, 2, 1, 0, 1],
       [2, 1, 0, 1, 0, 0, 2, 0],
       [2, 2, 1, 0, 1, 2, 1, 0]])

In [444]: (arr==0).argmax(axis=1)
Out[444]: array([0, 6, 2, 3])
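
The same idea as a self-contained snippet, closer to the 0/255 data in the question. This is only a minimal sketch: the array `demo` and its values are made up for illustration and are not from the original post.

```python
import numpy as np

# Hypothetical small array of 0s and 255s, for illustration only.
demo = np.array([[255,   0, 255,   0],
                 [  0, 255, 255, 255],
                 [255, 255, 255,   0]])

mask = demo == 0                  # True wherever the value is zero
first_zero = mask.argmax(axis=1)  # index of the first True in each row
print(first_zero)                 # [1 0 3]
```

Every row here contains at least one zero, so each result is a genuine match.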

Catching rows with no zeros (if we can!)

To handle rows that don't contain any zero, we need one more step with some masking: mask.any(1) flags the rows that have at least one zero, and np.where substitutes -1 for the rest:

In [445]: arr[2] = 9

In [446]: arr
Out[446]: 
array([[0, 1, 0, 2, 2, 1, 2, 2],
       [1, 1, 2, 2, 2, 1, 0, 1],
       [9, 9, 9, 9, 9, 9, 9, 9],
       [2, 2, 1, 0, 1, 2, 1, 0]])

In [447]: mask = arr==0

In [448]: np.where(mask.any(1), mask.argmax(1), -1)
Out[448]: array([ 0,  6, -1,  3])
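
Both steps can be wrapped in a small helper and checked against a plain loop. This is a sketch under stated assumptions: the function name `first_zero_per_row`, the `fill` parameter, and the randomly generated test array are hypothetical and not part of the original post; only the shape (1650 x 1275) and the 0/255 values come from the question.

```python
import numpy as np

def first_zero_per_row(a, fill=-1):
    """Column index of the first zero in each row, or `fill` if the row has none."""
    mask = a == 0
    return np.where(mask.any(axis=1), mask.argmax(axis=1), fill)

# Made-up data with the shape from the question, containing only 0s and 255s.
rng = np.random.default_rng(0)
new_arr = rng.choice(np.array([0, 255], dtype=np.uint8), size=(1650, 1275))
new_arr[5] = 255  # force one row that contains no zero

# Reference result from a straightforward per-row loop.
expected = []
for row in new_arr:
    hits = np.flatnonzero(row == 0)
    expected.append(hits[0] if hits.size else -1)

result = first_zero_per_row(new_arr)
assert result[5] == -1
assert np.array_equal(result, np.array(expected))
```

The vectorized version evaluates the mask once for the whole array instead of once per row, which is where the speedup over the loop comes from.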

10 Comments

By itself, this could give an unexpected answer if a row happens to be all zeroes, though.
@DSM Guess you meant no zero? :)
0.0018 seconds. OMG! For loops suck when it comes to numpy.
Though when I had no zeros in a row, the resultant array had zeros for those rows.
@BharathShetty Yeah, because there was no zero. That's why we needed the fix as listed in the second section.