
I want to generate a 2D numpy array with elements calculated from their positions. Something like the following code:

import numpy as np

def calculate_element(i, j, other_parameters):
    # do something
    return value_at_i_j

def main():
    arr = np.zeros((M, N))  # (M, N) is the shape of the array
    for i in range(M):
        for j in range(N):
            arr[i][j] = calculate_element(i, j, ...)

This code runs extremely slowly because loops in plain Python are just not very efficient. Is there any way to do this faster in this case?

By the way, for now I use a workaround by calculating two 2D "index matrices". Something like this:

def main():
    index_matrix_i = np.array([range(M)] * N).T
    index_matrix_j = np.array([range(N)] * M)

    '''
    index_matrix_i is like
    [[0,0,0,...],
     [1,1,1,...],
     [2,2,2,...],
     ...
    ]

    index_matrix_j is like
    [[0,1,2,...],
     [0,1,2,...],
     [0,1,2,...],
     ...
    ]
    '''

    arr = calculate_element(index_matrix_i, index_matrix_j, ...)

Edit1: The code becomes much faster after I apply the "index matrices" trick, so the main question I want to ask is whether there is a way to avoid this trick, since it takes extra memory. In short, I want a solution that is efficient in both time and space.

Edit2: Some examples I tested

import numpy as np
import matplotlib.pyplot as plt

# a simple 2D Gaussian
def calculate_element(i, j, i_mid, j_mid, i_sig, j_sig):
    gaus_i = np.exp(-((i - i_mid)**2) / (2 * i_sig**2))
    gaus_j = np.exp(-((j - j_mid)**2) / (2 * j_sig**2))
    return gaus_i * gaus_j
# size of M, N
M, N = 1200, 4000
# use for loops to go through every element
# this code takes ~10 seconds
def main_1():
    arr = np.zeros((M, N))  # (M, N) is the shape of the array
    for i in range(M):
        for j in range(N):
            arr[i][j] = calculate_element(i, j, 600, 2000, 300, 500)
    # print(arr)
    plt.figure(figsize=(8, 5))
    plt.imshow(arr, aspect='auto', origin='lower')
    plt.show()
# use index matrices
# this code takes <1 second
def main_2():
    index_matrix_i = np.array([range(M)] * N).T
    index_matrix_j = np.array([range(N)] * M)
    arr = calculate_element(index_matrix_i, index_matrix_j, 600, 2000, 300, 500)

    # print(arr)
    plt.figure(figsize=(8, 5))
    plt.imshow(arr, aspect='auto', origin='lower')
    plt.show()
  • What is the body of calculate_element? Regardless of how you assign things, you're still calling this function M*N times. If you can cache intermediate results of this function you could speed up the loops Commented Apr 22, 2022 at 18:48
  • There is a significant speed boost after I use the "index matrices" trick, so I think that it is the loops that take too much time, not the function. Commented Apr 22, 2022 at 18:55
  • Please add a minimal reproducible example. This looks trivially vectorizable (with a space overhead of M * N * np.intp) and easily avoidable with numba. But without the body of calculate_element and a size estimate of M, N it's just a guess. Commented Apr 22, 2022 at 19:06
  • The fast compiled numpy methods work with or produce whole arrays. As such they can take up a lot of memory, even if it is only in temporary buffers. It's easy to estimate memory usage from the shape and dtype. Iteration can avoid those large temporary arrays, though it won't change the size of the final result. But you lose speed, unless you implement the loops with a compilation tool like cython or numba. Commented Apr 22, 2022 at 20:03

3 Answers


You can use np.indices() to generate the desired index matrices:

For example,

np.indices((3, 4))

outputs:

[[[0 0 0 0]
  [1 1 1 1]
  [2 2 2 2]]

 [[0 1 2 3]
  [0 1 2 3]
  [0 1 2 3]]]

3 Comments

Wow! That is an efficient way to generate the index matrices! But is there any way to reduce the memory usage?
I found another way to do this: index_matrix_i, index_matrix_j = np.meshgrid(range(M), range(N), indexing='ij'). The speed is comparable to the method from @BrokenBenchmark, and much faster than my original index-matrix generation.
meshgrid works too. By the way, remember to upvote useful answers and accept the one you find most helpful, if you haven't already.
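If memory is the main concern, note that in recent NumPy versions both np.indices() and np.meshgrid() also accept sparse=True, which returns broadcastable grids of shape (M, 1) and (1, N) instead of two full (M, N) matrices. A minimal sketch, reusing calculate_element from the question's Edit2:

import numpy as np

# calculate_element as defined in the question's Edit2 (a separable 2D Gaussian)
def calculate_element(i, j, i_mid, j_mid, i_sig, j_sig):
    gaus_i = np.exp(-((i - i_mid)**2) / (2 * i_sig**2))
    gaus_j = np.exp(-((j - j_mid)**2) / (2 * j_sig**2))
    return gaus_i * gaus_j

M, N = 1200, 4000

# Dense grids: equivalent to the index-matrix trick, two full (M, N) arrays.
ii, jj = np.indices((M, N))
arr_dense = calculate_element(ii, jj, 600, 2000, 300, 500)

# Sparse grids: ii has shape (M, 1), jj has shape (1, N); broadcasting expands
# them only in the final (M, N) result, so no full index matrices are stored.
ii, jj = np.indices((M, N), sparse=True)
arr_sparse = calculate_element(ii, jj, 600, 2000, 300, 500)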

The vectorized version is faster than a trivially jitted one on my 2-core machine.

import numpy as np
import matplotlib.pyplot as plt

M, N = 1200, 4000
i = np.arange(M)
j = np.arange(N)
i_mid, j_mid, i_sig, j_sig = 600, 2000, 300, 500

arr = np.exp(-(i - i_mid)**2 / (2 * i_sig**2))[:,None] * np.exp(-(j - j_mid)**2 / (2 * j_sig**2))
# %timeit 100 loops, best of 5: 8.82 ms per loop

plt.figure(figsize=(8, 5))
plt.imshow(arr, aspect='auto', origin='lower')
plt.show()

numpy result
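A side note on the memory question: the one-liner above relies on this Gaussian being separable into a column factor and a row factor. When calculate_element is not separable, a similarly small footprint can still be had with "open" grids such as np.ogrid, which yield broadcastable 1-D index arrays rather than full matrices. A minimal sketch with the same parameters:

import numpy as np

M, N = 1200, 4000
i_mid, j_mid, i_sig, j_sig = 600, 2000, 300, 500

# np.ogrid returns open grids: i has shape (M, 1), j has shape (1, N).
# Any element-wise expression in i and j broadcasts to the full (M, N) result
# without materializing two full index matrices first.
i, j = np.ogrid[0:M, 0:N]
arr = np.exp(-(i - i_mid)**2 / (2 * i_sig**2)) * np.exp(-(j - j_mid)**2 / (2 * j_sig**2))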

Jitted parallel numba

import numba as nb  # tested with numba 0.55.1

@nb.njit(parallel=True)
def calculate_element_nb(m, n, i_mid, j_mid, i_sig, j_sig):
    # m, n give the output shape, so the loop indices i, j below don't shadow them
    res = np.empty((m, n), np.float32)
    for i in nb.prange(res.shape[0]):
        for j in range(res.shape[1]):
            res[i,j] = np.exp(-(i - i_mid)**2 / (2 * i_sig**2)) * np.exp(-(j - j_mid)**2 / (2 * j_sig**2))
    return res

M, N = 1200, 4000

calculate_element_nb(M, N, 600, 2000, 300, 500)  # first call also triggers the JIT compilation
# %timeit 10 loops, best of 5: 80.4 ms per loop

plt.figure(figsize=(8, 5))
plt.imshow(calculate_element_nb(M, N, 600, 2000, 300, 500), aspect='auto', origin='lower')
plt.show()

numba result



You can use a single loop to fill a nested list and, once all its elements are in place, convert it to a np.array like this:

import numpy as np

m, n = 5, 5
arr = []
for i in range(0, m*n, n):
    arr.append(list(range(i, i+n)))
print(np.array(arr))

Output:

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]
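For this particular fill pattern (consecutive integers in row-major order), the same array can also be produced without any Python loop, for example with arange and reshape:

import numpy as np

m, n = 5, 5
# arange builds the flat 0 .. m*n-1 sequence; reshape views it as an (m, n) array.
arr = np.arange(m * n).reshape(m, n)
print(arr)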

