How can I improve the efficiency of this numpy loop

Question

I've got a numpy array containing labels. I'd like to get calculate a number for each label based on its size and bounding box. How can I write this more efficiently so that it's realistic to use on large arrays (~15000 labels)?

A = array([[ 1, 1, 0, 3, 3],
           [ 1, 1, 0, 0, 0],
           [ 1, 0, 0, 2, 2],
           [ 1, 0, 2, 2, 2]] )

B = zeros( 4 )

for label in range(1, 4):
    # get the bounding box of the label
    label_points = argwhere( A == label )
    (y0, x0), (y1, x1) = label_points.min(0), label_points.max(0) + 1

    # assume I've computed the size of each label in a numpy array size_A
    B[ label ] = myfunc(y0, x0, y1, x1, size_A[label])

Did you do some profiling to see which of your statements is the one slowing you down? Maybe it's the function myfunc which could probably be parallized by saving y0, x0, y1, x1 in separate arrays getting out of the loop and only calling the function once. Otherwise, if speed really counts, you may want to look into whether it's worth doing some C code. I found cython to be really comfortable when working with numpy arrays. — David Zwicker
– David Zwicker, Commented Nov 23, 2011 at 16:46

Sven Marnach · Accepted Answer · 2011-11-23 17:25:13Z

I wasn't really able to implement this efficiently using some NumPy vectorised functions, so maybe a clever Python implementation will be faster.

def first_row(a, labels):
    d = {}
    d_setdefault = d.setdefault
    len_ = len
    num_labels = len_(labels)
    for i, row in enumerate(a):
        for label in row:
            d_setdefault(label, i)
        if len_(d) == num_labels:
            break
    return d

This function returns a dictionary mapping each label to the index of the first row it appears in. Applying the function to A, A.T, A[::-1] and A.T[::-1] also gives you the first column as well as the last row and column.

If you would rather like a list instead of a dictionary, you can turn the dictionary into a list using map(d.get, labels). Alternatively, you can use a NumPy array instead of a dictionary right from the start, but you will lose the ability to leave the loop early as soon as all labels were found.

I'd be interested whether (and how much) this actually speeds up your code, but I'm confident that it is faster than your original solution.

It's not the prettiest way to go, but it definitely works. My original way ran so long that I never even let it finish (gave up after 20 minutes). I just ran your method and got it in 6m30s.
@ajwood: Thanks for the feedback. I know it's not pretty, but the best easy solution I could think of. If you want to do it even faster, I'd suggest to implement it in Cython.

HYRY · Accepted Answer · 2011-11-24 02:25:57Z

Another method:

use bincount() to get labels count in every row and column, and save the information in rows and cols array.

For each label you only need to search the range in rows and columns. It's faster than sort, on my pc, it can do the calculation in a few seconds.

def label_range2(A):
    maxlabel = np.max(A)+1
    h, w = A.shape
    rows = np.zeros((h, maxlabel), np.bool)
    for row in xrange(h):
        rows[row,:] = np.bincount(A[row,:], minlength=maxlabel) > 0

    cols = np.zeros((w, maxlabel), np.bool)
    for col in xrange(w):
        cols[col,:] =np.bincount(A[:,col], minlength=maxlabel) > 0

    for label in xrange(1, maxlabel):
        row = rows[:, label]
        col = cols[:, label]
        y = np.where(row)[0]
        x = np.where(col)[0]
        x0 = np.min(x)
        x1 = np.max(x)+1
        y0 = np.min(y)
        y1 = np.max(y)+1        
        yield label, x0,y0,x1,y1

Sven Marnach · Accepted Answer · 2011-11-24 12:51:40Z

Algorithm:

change the array to one dimension
get the sort index by argsort()
get the sorted version of on dimension array as sorted_A
use where() and diff() to find label change position in sorted_A
use the change position and the sort index to get the original position of the label in one dimension.
calculate two dimension location from the on dimension position.

for large array such as (7000, 9000), is can finished the calculation in 30s.

here is the code:

import numpy as np

A = np.array([[ 1, 1, 0, 3, 3],
           [ 1, 1, 0, 0, 0],
           [ 1, 0, 0, 2, 2],
           [ 1, 0, 2, 2, 2]] )

def label_range(A):
    from itertools import izip_longest
    h, w = A.shape
    tmp = A.reshape(-1)

    index = np.argsort(tmp)
    sorted_A = tmp[index]
    pos = np.where(np.diff(sorted_A))[0]+1
    for p1,p2 in izip_longest(pos,pos[1:]):
        label_index = index[p1:p2]
        y = label_index // w
        x = label_index % w

        x0 = np.min(x)
        x1 = np.max(x)+1
        y0 = np.min(y)
        y1 = np.max(y)+1
        label = tmp[label_index[0]]

        yield label,x0,y0,x1,y1

for label,x0,y0,x1,y1 in label_range(A):
    print "%d:(%d,%d)-(%d,%d)" % (label, x0,y0,x1,y1)

#B = np.random.randint(0, 100, (7000, 9000))
#list(label_range(B))

I accidentally downvoted you post because I thought the algorithm was wrong. I had to do a dummy edit to unlock the vote -- upvoted instead. :)

silvado · Accepted Answer · 2011-11-23 20:00:22Z

The performace bottleneck seems indeed to be the call to argmax. It can be avoided by changing the loop as follows (only computing y0, y1, but easy to generalize to x0, x1):

for label in range(1, 4):
    comp = (A == label)
    yminind = comp.argmax(0)
    ymin = comp.max(0)
    ymaxind = comp.shape[0] - comp[::-1].argmax(0)
    y0 = yminind[ymin].min()
    y1 = ymaxind[ymin].max()

I'm not sure about the reason for the performance difference, but one reason might be that all operations like ==, argmax, and max can preallocate their output array directly from the shape of the input array, which is not possible for argwhere.

fijal · Accepted Answer · 2011-11-28 11:58:58Z

1

Using PyPy you can just run the loop and not worry about the vectorization. It should be fast.

answered Nov 28, 2011 at 11:58

fijal

3,20019 silver badges21 bronze badges

Collectives™ on Stack Overflow

How can I improve the efficiency of this numpy loop

5 Answers 5

2 Comments

1 Comment

1 Comment

Comments

Comments

Hot Network Questions

Collectives™ on Stack Overflow

5 Answers 5

2 Comments

1 Comment

1 Comment

Comments

Comments

Related