NumPy 2D array iteration speed

Question

I have a loop that fills a 2-D NumPy array, called 'Shadows', with pixel information for PIL. The colours are either white or blue. I want to build up a final image from these arrays in which white is dominant, i.e. if one image in the loop has a blue pixel at co-ordinate (x, y) and another image in the loop has a white pixel at the same co-ordinate, then the final pixel will be white.

This is currently done by:

import math, random, copy
import numpy as np
from PIL import Image, ImageDraw

colours = {0: (255,255,255), 1: (0,0,255)}

#width and height of area of interest
w = 100 #100 meter
h = 200 #200 meter

NumberOfDots = 10
DotRadius = 20
NumberOfRuns = 3

Final = np.array([[colours[0] for x in range(w)] for y in range(h)])
Shadows = np.array([[colours[0] for x in range(w)] for y in range(h)])

for SensorNum in range(NumberOfRuns):

  Shadows = np.array([[colours[0] for x in range(w)] for y in range(h)])

  for dot in range(NumberOfDots):

    ypos = random.randint(DotRadius, h-DotRadius)
    xpos = random.randint(DotRadius, w-DotRadius)

    for i in range(xpos - DotRadius, xpos + DotRadius):
      for j in range(ypos - DotRadius, ypos + DotRadius):
          if math.sqrt((xpos - i)**2 + (ypos - j)**2) < DotRadius:
            Shadows[j][i] = colours[1]

  im = Image.fromarray(Shadows.astype('uint8')).convert('RGBA')
  im.save('result_test_image'+str(SensorNum)+'.png')

  #This for loop below is the bottle-neck. Can its speed be improved?
  if SensorNum > 0:
    for i in range(w):
      for j in range(h):
        #White space dominates.
        #(pixel by pixel) If the current image's pixel is white and the unfinished Final
        #image's pixel is blue then set the final pixel to white.
        if np.all(Shadows[j][i]==colours[0]) and np.all(Final[j][i]==colours[1]):
          Final[j][i] = colours[0]
  else:
    Final = copy.deepcopy(Shadows)

im = Image.fromarray(Final.astype('uint8')).convert('RGBA')
im.save('result_final_test.png')

The final nested for loop is what I am interested in improving. It works fine, but the iteration is a huge bottleneck. Is there any way to do this quicker, for example by vectorizing it?

I can't reproduce your code. NumPy arrays can't store tuples afaik, so I guess Shadows, in contrast to what you say, is not an ndarray. Could you please give a working example including all imported modules?
– JE_Muc, Commented May 9, 2018 at 13:53
I updated the code so you should be able to run it. Unless I've missed something it looks like ndarrays can hold tuples :-)
– Rob, Commented May 9, 2018 at 16:26
Thanks! Of course there is a vectorized approach for this situation. I'll post a solution in a few minutes. But no: ndarrays can't hold tuples. NumPy is converting each tuple to an array with shape (3,). The calculation time for this is nearly 2 times more than using arrays.
– JE_Muc, Commented May 11, 2018 at 9:29
Sorry for the delay. Thanks for pointing out that ndarrays can't hold tuples, I genuinely didn't know that...
– Rob, Commented May 14, 2018 at 10:52
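
The point raised in these comments can be checked directly. The following is a minimal sketch using the colours dict and the sizes from the question; the name shadows_fast and the use of np.full are additions for illustration, not part of the original code. NumPy turns the nested list of tuples into one 3-D integer array, so each "pixel" is a length-3 array rather than a tuple.

```python
import numpy as np

colours = {0: (255, 255, 255), 1: (0, 0, 255)}
w, h = 100, 200

# Nested lists of 3-tuples become one 3-D array, not a 2-D array of tuples.
shadows = np.array([[colours[0] for x in range(w)] for y in range(h)])

print(shadows.shape)        # (200, 100, 3)
print(type(shadows[0, 0]))  # <class 'numpy.ndarray'>

# The same array can be built without the Python-level list comprehension:
# np.full broadcasts the 3-element colour over the (h, w, 3) shape.
shadows_fast = np.full((h, w, 3), colours[0], dtype=np.uint8)
```

Building the array with np.full also fixes the dtype to uint8 up front, which is what Image.fromarray expects for RGB data.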

JE_Muc · Accepted Answer · 2018-05-11 12:28:58Z

Of course it is possible to vectorize the last for loop in your code, since no iteration depends on values calculated in a previous iteration. But honestly it was not as easy as I thought it would be...

My approach is around 800 to 1000 times faster than your current loop. I replaced the upper-case array and variable names with lower-case names using underscores. Upper-case names are usually reserved for classes in Python, which is the reason for the strange code colouring in your question.

if sensor_num > 0:
    mask = (  # create a mask where the condition is True
        ((shadows[:, :, 0] == 255) &  # R=255
         (shadows[:, :, 1] == 255) &  # G=255
         (shadows[:, :, 2] == 255)) &  # B=255
        ((final[:, :, 0] == 0) &  # R=0
         (final[:, :, 1] == 0) &  # G=0
         (final[:, :, 2] == 255)))  # B=255
    final[mask] = np.array([255, 255, 255])  # set Final to white where mask is True
else:
    final = copy.deepcopy(shadows)

The RGB values can of course be replaced with a lookup of predefined values, as with your colours dict. But I would propose using an array to store the colours, especially if you plan to index it with numbers:

colours = np.array([[255, 255, 255], [0, 0, 255]])

so that the mask will look like:

mask = (  # create a mask where the condition is True
    ((shadows[:, :, 0] == colours[0, 0]) &  # R=255
     (shadows[:, :, 1] == colours[0, 1]) &  # G=255
     (shadows[:, :, 2] == colours[0, 2])) &  # B=255
    ((final[:, :, 0] == colours[1, 0]) &  # R=0
     (final[:, :, 1] == colours[1, 1]) &  # G=0
     (final[:, :, 2] == colours[1, 2])))  # B=255
final[mask] = colours[0]  # set Final to white where mask is True

Of course this also works using a dict.
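
A more compact way to write the same mask, offered here as an alternative sketch rather than as the method above, is to compare whole pixels and reduce over the colour axis with np.all. The 2x2 test images and the names white and blue are hypothetical, chosen only to make the result easy to read.

```python
import numpy as np

colours = np.array([[255, 255, 255], [0, 0, 255]], dtype=np.uint8)
white, blue = colours

# Small hypothetical test images, shape (h, w, 3).
shadows = np.array([[white, blue], [white, blue]])
final = np.array([[blue, blue], [white, white]])

# Compare whole pixels at once: == broadcasts the 3-element colour over the
# last axis, and np.all(..., axis=-1) collapses it to one bool per pixel.
mask = np.all(shadows == white, axis=-1) & np.all(final == blue, axis=-1)
final[mask] = white

print(mask)
```

Only the top-left pixel is white in shadows and blue in final, so it is the only one that changes. This form does a little more work than comparing channel by channel, but it keeps working unchanged if the colours are swapped for other values.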

To speed this up a little further, you can replace part of the RGB comparison in the mask with comparisons between the array's own channels (a stencil computation). This is about 5% faster for your array size, with the speed difference growing with array size, but you lose the flexibility of comparing other colours by just changing the entries in the colours array/dict. The mask with stencil operations looks like:

mask = (  # create a mask where the condition is True
    ((shadows[:, :, 0] == shadows[:, :, 1]) &  # R=G
     (shadows[:, :, 1] == shadows[:, :, 2]) &  # G=B
     (shadows[:, :, 2] == colours[0, 2])) &  # R=G=B=255
    ((final[:, :, 0] == final[:, :, 1]) &  # R=G
     (final[:, :, 1] == colours[1, 1]) &  # G=0
     (final[:, :, 2] == colours[1, 2])))  # B=255

This should help speed up your computation substantially.
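
To check that the masked version gives the same result as the original loop, and to measure the difference on your own machine, something like the following sketch can be used. The helper names merge_loop and merge_mask, the random test images, and the fixed seed are assumptions made for this example; the timings will vary between machines.

```python
import time
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only so the run is repeatable
colours = np.array([[255, 255, 255], [0, 0, 255]], dtype=np.uint8)
h, w = 200, 100

# Random two-colour images standing in for Shadows and Final.
shadows = colours[rng.integers(0, 2, size=(h, w))]
final = colours[rng.integers(0, 2, size=(h, w))]

def merge_loop(shadows, final):
    """Pixel-by-pixel version, as in the question."""
    out = final.copy()
    for i in range(w):
        for j in range(h):
            if np.all(shadows[j, i] == colours[0]) and np.all(out[j, i] == colours[1]):
                out[j, i] = colours[0]
    return out

def merge_mask(shadows, final):
    """Masked version, as in the answer."""
    out = final.copy()
    mask = (
        ((shadows[:, :, 0] == colours[0, 0]) &
         (shadows[:, :, 1] == colours[0, 1]) &
         (shadows[:, :, 2] == colours[0, 2])) &
        ((out[:, :, 0] == colours[1, 0]) &
         (out[:, :, 1] == colours[1, 1]) &
         (out[:, :, 2] == colours[1, 2])))
    out[mask] = colours[0]
    return out

t0 = time.perf_counter()
expected = merge_loop(shadows, final)
t1 = time.perf_counter()
result = merge_mask(shadows, final)
t2 = time.perf_counter()

same = np.array_equal(expected, result)
print("identical:", same)
print(f"loop: {t1 - t0:.4f} s, mask: {t2 - t1:.6f} s")
```

A single perf_counter measurement is noisy; for more reliable numbers, timeit with several repetitions is the better tool.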

Other parts of the code can also be optimized, but of course this is only worth it if those parts are themselves a bottleneck. Just one example: instead of calling random.randint in every loop iteration, you could call it once to create an array of random positions (and also the +/- DotRadius arrays) and then loop over these arrays:

ypos = np.random.randint(DotRadius, h-DotRadius, size=NumberOfDots)
ypos_plus_dot_radius = ypos + DotRadius
ypos_minus_dot_radius = ypos - DotRadius
xpos = np.random.randint(DotRadius, w-DotRadius, size=NumberOfDots)
xpos_plus_dot_radius = xpos + DotRadius
xpos_minus_dot_radius = xpos - DotRadius
for dot in range(NumberOfDots):
    yrange = np.arange(ypos_minus_dot_radius[dot], ypos_plus_dot_radius[dot])  # make range instead of looping
    # looping over xrange imho can't be avoided without further matrix operations
    for i in range(xpos_minus_dot_radius[dot], xpos_plus_dot_radius[dot]):
        # make a mask for the y-positions where the condition is true and
        # index the y-axis of Shadows with this mask:
        Shadows[yrange[np.sqrt((xpos[dot] - i)**2 + (ypos[dot] - yrange)**2) < DotRadius], i] = colours[1]
        # colours[1] can of course be replaced with any 3-element array or single integer/float
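
If the remaining loop over x is still too slow, one possible way to remove it is to build a boolean disc over the whole image with broadcasting. This is a sketch of an alternative, not the code above: it uses lower-case names, np.ogrid, NumPy's Generator API with a fixed seed, and squared distances instead of np.sqrt, all of which are choices made for this example.

```python
import numpy as np

w, h = 100, 200
dot_radius = 20
number_of_dots = 10
colours = np.array([[255, 255, 255], [0, 0, 255]], dtype=np.uint8)

rng = np.random.default_rng(0)  # fixed seed, only for repeatability
# endpoint=True matches random.randint, which includes the upper bound.
ypos = rng.integers(dot_radius, h - dot_radius, size=number_of_dots, endpoint=True)
xpos = rng.integers(dot_radius, w - dot_radius, size=number_of_dots, endpoint=True)

shadows = np.full((h, w, 3), colours[0], dtype=np.uint8)

# Open grids of row and column indices: yy has shape (h, 1), xx has shape (1, w).
yy, xx = np.ogrid[:h, :w]

for y0, x0 in zip(ypos, xpos):
    # Broadcasting yields an (h, w) boolean disc; comparing squared
    # distances gives the same pixels as sqrt(...) < dot_radius.
    disc = (xx - x0) ** 2 + (yy - y0) ** 2 < dot_radius ** 2
    shadows[disc] = colours[1]
```

The trade-off is that each dot now evaluates the distance for every pixel of the image rather than only for its bounding box. For an image of this size that is cheap, but for very large images with small dots, slicing the grids to the bounding box first may be preferable.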

A massive thank you for this, and for the very detailed explanation :-). It has made a massive improvement to my code's execution time. This bottleneck has basically disappeared! A rough order of magnitude for the improvement is, like you said, around the x1000 mark. This section of code was taking around 100 seconds to run, and now it's about 0.1 sec! So a massive thank you :-)
You are welcome, and thanks for accepting the answer! For more information on coding style you should take a look at the PEP 8 style guide. It may not seem useful in the beginning, but once you have more than 1000 lines of code with classes and multiple files, it is absolutely helpful to follow it. If you need more help on masking etc., just ask.
