Speeding up a numpy loop in python?

Question

Consider the following code using numpy arrays which is very slow :

# Intersection of an octree and a trajectory
def intersection(octree, trajectory):
    # Initialize numpy arrays
    ox = octree.get("x")
    oy = octree.get("y")
    oz = octree.get("z")
    oe = octree.get("extent")/2
    tx = trajectory.get("x")
    ty = trajectory.get("y")
    tz = trajectory.get("z")
    result = np.zeros(np.size(ox))
    # Loop over elements
    for i in range(0, np.size(tx)):
        for j in range(0, np.size(ox)):
            if (tx[i] > ox[j]-oe[j] and 
                tx[i] < ox[j]+oe[j] and 
                ty[i] > oy[j]-oe[j] and 
                ty[i] < oy[j]+oe[j] and 
                tz[i] > oz[j]-oe[j] and 
                tz[i] < oz[j]+oe[j]):
                result[j] += 1
    # Finalize
    return result

How to rewrite the function to speed up the calculation ? (np.size(tx) == 10000 and np.size(ox) == 100000)

Build a scipy.spatial.KDTree from the points tx, ty, tz and then use nearest-neighbour look-up in the infinity norm for each point in ox, oy, oz to see whether there is any point close enough. — Sven Marnach
– Sven Marnach, Commented Jun 5, 2014 at 23:07
Have you considered using Cython? I have read that it gives large speedups without much pain. stackoverflow.com/questions/7799977/numpy-vs-cython-speed — steveha
– steveha, Commented Jun 5, 2014 at 23:14

Oleg Sklyar · Accepted Answer · 2014-06-05 23:28:26Z

6

You are allocating 10000 lists of size 100000. The first thing to do would be to stop using range for the nested j loop and use the generator version xrange instead. This will save you time and space allocating all those lists.

The next one would be to use vectorized operations:

for i in xrange(0, np.size(tx)):
    index = (ox-oe < tx[i]) & (ox+oe > tx[i]) & (oy-oe < ty[i]) & (oy+oe > ty[i]) & (oz-oe < tz[i]) & (oz+oe > tz[i])
    result[index] += 1

edited Jun 5, 2014 at 23:28

answered Jun 5, 2014 at 23:18

Oleg Sklyar

10.2k6 gold badges43 silver badges64 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

Oleg Sklyar Over a year ago

Give me a sec, there is the next one coming :)

Oleg Sklyar Over a year ago

Ok, the vectorized form is in (may need checking of conditions), but as you modified your question in response to my original answer that part of the answer became irrelevant even though it would still be a good initial step to do in case of non-bumpy data :)

Alex Gaynor · Accepted Answer · 2014-06-05 23:26:35Z

0

You're likely to get good results by running this code under PyPy: http://pypy.org/ (instructions for our NumPy integration at https://bitbucket.org/pypy/numpy)

answered Jun 5, 2014 at 23:26

Alex Gaynor

15.1k10 gold badges65 silver badges113 bronze badges

Comments

tom10 · Accepted Answer · 2014-06-06 04:45:35Z

I think this will give the same result for the double loop and be faster:

for j in xrange(np.size(ox)):
    result[j] += sum( abs(tx-ox[j])<oe[j] & abs(ty-oy[j])<oe[j] & abs(tz-oz[j])<oe[j] )

To get this: 1) reorder the loops (ie, swap them) which is valid since nothing changes within the loops; 2) pull result[j] outside the i loop; 3) convert all the t>ox-oe and t<ox+oe to abs(t-ox)<oe (though this may not be a huge speedup, it's easier to read).

Since you don't have runnable code, and I didn't want to build a test for this, I'm not 100% sure this is correct.

Collectives™ on Stack Overflow

Speeding up a numpy loop in python?

3 Answers 3

2 Comments

Comments

Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

2 Comments

Comments

Comments

Linked

Related