
I am trying to create a 3D array whose entries are computed from the cell indices. Specifically, I want cell (i, j, k) to hold sqrt(i + j + k).

This is very easily done with the following for loops:

import numpy as np

N = 10
A = np.zeros((N, N, N))

for i in range(N):
    for j in range(N):
        for k in range(N):
            A[i, j, k] = np.sqrt(i + j + k)

I was wondering if NumPy has built-in functions that make these nested for loops redundant.

3 Answers


The simplest and most performant approach would be to build open grids with np.ogrid and then perform the relevant operation(s) -

I,J,K = np.ogrid[:N,:N,:N]
A = np.sqrt(I+J+K)
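
To make the broadcasting explicit, here is a minimal sketch (using a small N of my own choosing) showing the shapes of the open grids and a check against the loop version from the question:

```python
import numpy as np

N = 4
I, J, K = np.ogrid[:N, :N, :N]
# ogrid returns "open" grids: shapes (N,1,1), (1,N,1), (1,1,N)
print(I.shape, J.shape, K.shape)  # (4, 1, 1) (1, 4, 1) (1, 1, 4)

A = np.sqrt(I + J + K)  # broadcasting expands the sum to shape (N, N, N)

# check against the explicit triple loop from the question
B = np.zeros((N, N, N))
for i in range(N):
    for j in range(N):
        for k in range(N):
            B[i, j, k] = np.sqrt(i + j + k)
assert np.allclose(A, B)
```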

Or, as a one-liner, use np.sum for the broadcasted summation of the open grids -

A = np.sqrt(np.sum(np.ogrid[:N,:N,:N]))

Relevant: General workflow on vectorizing loops involving range iterators.


3 Comments

That's great! I'll accept your answer in a couple of hours...maybe someone else has an even more pythonesque way. I doubt that though
I didn't know np.sum can do that. (summing something that doesn't coerce to a single array). Seems to be a bit slower (by a tiny constant overhead) than the builtin sum, though.
@DouglasJamesBock I do not think anyone can provide a better answer; at least I cannot. PS: I learned NumPy from Divakar

You can use np.arange and then np.newaxis to create the different dimensions, with a simple sum and np.sqrt doing the job after:

arr = np.arange(N)
A = np.sqrt(arr + arr[:, np.newaxis] + arr[:, np.newaxis, np.newaxis])
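
To see why this works, here is a small sketch (with an N of my own choosing) of the shapes involved; broadcasting aligns trailing axes, so the three operands expand to a full (N, N, N) grid:

```python
import numpy as np

N = 3
arr = np.arange(N)
# the three operands have shapes (N,), (N, 1) and (N, 1, 1)
print(arr.shape)                             # (3,)
print(arr[:, np.newaxis].shape)              # (3, 1)
print(arr[:, np.newaxis, np.newaxis].shape)  # (3, 1, 1)

A = np.sqrt(arr + arr[:, np.newaxis] + arr[:, np.newaxis, np.newaxis])
print(A.shape)  # (3, 3, 3)

# element (i, j, k) equals sqrt(i + j + k), as in the loop version
assert np.isclose(A[1, 2, 0], np.sqrt(3))
```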

You get the same result:

N = 10
arr = np.arange(N)
A = np.sqrt(arr + arr[:, np.newaxis] + arr[:, np.newaxis, np.newaxis])
B = np.sqrt(np.sum(np.ogrid[:N, :N, :N]))
print((A == B).all())
# True

This method is a bit faster than using np.ogrid:

N = 10
%timeit arr = np.arange(N); A = np.sqrt(arr + arr[:,np.newaxis]+ arr[:,np.newaxis,np.newaxis])
#18.6 µs ± 3.29 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit A = np.sqrt(np.sum(np.ogrid[:N,:N,:N]))
#58.5 µs ± 8.01 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

3 Comments

That's cheeky - using the fact that all three iterators run over the same range N :) You are gaining the performance by avoiding the range creations for all three iterators.
@Divakar Thanks :) Actually, even if you don't create arr but call np.arange(N) directly each time, it is still faster than ogrid. So I guess it would still be faster with a different length in each dimension. Maybe for larger N it won't be the case; I did not try higher N.
Ben.T, @Divakar Many of the numpy convenience functions have terrible overheads...

This is faster for large N but may be considered cheating ;-)

It takes full advantage of the highly regular and repetitive pattern to save lots of evaluations of the square root.

def cheat(N):
    # square roots of every possible index sum: 0 .. 3*(N-1)
    values = np.sqrt(np.arange(3*N-2))
    # identical stride on all three axes, so result[i, j, k] reads values[i+j+k]
    result = np.lib.stride_tricks.as_strided(values, (N, N, N), 3*values.strides)
    return np.ascontiguousarray(result)

If you can live with a non-contiguous, read-only (by all practical means) view this can be substantially faster:

def cheat_nc_view(N):
    values = np.sqrt(np.arange(3*N-2))
    return np.lib.stride_tricks.as_strided(values, (N, N, N), 3*values.strides)
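
A brief sketch (with an N of my own choosing) of why the striding trick is valid: with the same element stride on every axis, the view's cell (i, j, k) reads values[i + j + k], and i + j + k only takes the 3*N - 2 distinct values 0 .. 3*(N-1). The view aliases the 1D buffer, which is why it should be treated as read-only:

```python
import numpy as np

N = 5
values = np.sqrt(np.arange(3 * N - 2))
view = np.lib.stride_tricks.as_strided(values, (N, N, N), 3 * values.strides)

# cell (i, j, k) is just values[i + j + k]
assert view[2, 3, 4] == values[9]

# the view shares memory with `values`; writing to it would corrupt
# many overlapping cells at once, so treat it as read-only
assert np.shares_memory(view, values)
```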

For reference:

def cheek(N):
    arr = np.arange(N)
    return np.sqrt(arr + arr[:,np.newaxis] + arr[:,np.newaxis,np.newaxis])

>>> np.all(cheek(20) == cheat(20))
True
>>> np.all(cheek(200) == cheat_nc_view(200))
True

Timings:

>>> timeit(lambda: cheek(20), number=1000)
0.05387042500660755
>>> timeit(lambda: cheat(20), number=1000)
0.020798540994292125
>>> timeit(lambda: cheat_nc_view(20), number=1000)
0.010791150998556986

>>> timeit(lambda: cheek(200), number=100)
6.823299437994137
>>> timeit(lambda: cheat(200), number=100)
2.0583883369981777
>>> timeit(lambda: cheat_nc_view(200), number=100)
0.0014881940151099116
