I want to generate a 2D numpy array with elements calculated from their positions. Something like the following code:
import numpy as np
def calculate_element(i, j, other_parameters):
    # do something
    return value_at_i_j
def main():
    arr = np.zeros((M, N))  # (M, N) is the shape of the array
    for i in range(M):
        for j in range(N):
            arr[i, j] = calculate_element(i, j, ...)
This code runs extremely slowly, since loops in Python are just not very efficient. Is there any way to do this faster in this case?
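One built-in way to express this pattern, assuming calculate_element is written with vectorized (broadcasting-compatible) operations, is np.fromfunction, which calls the function once with whole index arrays instead of once per element. The calculate_element body below is a hypothetical stand-in, since the real one isn't shown here. Note that fromfunction materializes the same full index arrays internally (via np.indices), so this is a readability win rather than a memory win:

```python
import numpy as np

M, N = 4, 5  # small example shape

# Hypothetical elementwise function: the value depends only on the indices.
# Because fromfunction passes whole arrays, the body must use array
# operations (here: elementwise + and *), not scalar-only logic.
def calculate_element(i, j, scale):
    return (i + j) * scale

# fromfunction calls calculate_element ONCE, with i of shape (M, N)
# and j of shape (M, N), and returns the full result array.
arr = np.fromfunction(lambda i, j: calculate_element(i, j, 2.0), (M, N))
```

This is equivalent to the double loop, but without Python-level iteration.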
By the way, for now I use a workaround by calculating two 2D "index matrices". Something like this:
def main():
    index_matrix_i = np.array([range(M)] * N).T
    index_matrix_j = np.array([range(N)] * M)
    '''
    index_matrix_i is like
    [[0,0,0,...],
     [1,1,1,...],
     [2,2,2,...],
     ...
    ]
    index_matrix_j is like
    [[0,1,2,...],
     [0,1,2,...],
     [0,1,2,...],
     ...
    ]
    '''
    arr = calculate_element(index_matrix_i, index_matrix_j, ...)
Edit1: The code becomes much faster after I apply the "index matrices" trick, so the main question I want to ask is whether there is a way to avoid this trick, since it takes extra memory. In short, I want a solution that is efficient in both time and space.
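One sketch of a time-and-space-efficient alternative, assuming calculate_element uses only broadcasting-compatible operations (as the Gaussian example below does): np.ogrid returns "open" index grids of shape (M, 1) and (1, N), which broadcast against each other to the full (M, N) shape without ever storing two full M-by-N index matrices:

```python
import numpy as np

M, N = 1200, 4000

# ii has shape (M, 1), jj has shape (1, N): O(M + N) memory for the
# indices instead of the O(M * N) cost of the two full index matrices.
ii, jj = np.ogrid[:M, :N]

def calculate_element(i, j, i_mid, j_mid, i_sig, j_sig):
    gaus_i = np.exp(-((i - i_mid) ** 2) / (2 * i_sig ** 2))  # shape (M, 1)
    gaus_j = np.exp(-((j - j_mid) ** 2) / (2 * j_sig ** 2))  # shape (1, N)
    return gaus_i * gaus_j  # broadcasting produces the (M, N) result here

arr = calculate_element(ii, jj, 600, 2000, 300, 500)
```

np.meshgrid(np.arange(M), np.arange(N), indexing='ij', sparse=True) gives the same sparse pair. The full (M, N) array is of course still allocated for the result, but the index arrays and the intermediate gaus_i / gaus_j temporaries stay small.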
Edit2: Some examples I tested
import matplotlib.pyplot as plt

# a simple 2D Gaussian
def calculate_element(i, j, i_mid, j_mid, i_sig, j_sig):
    gaus_i = np.exp(-((i - i_mid)**2) / (2 * i_sig**2))
    gaus_j = np.exp(-((j - j_mid)**2) / (2 * j_sig**2))
    return gaus_i * gaus_j
# size of M, N
M, N = 1200, 4000
# use for loops to go through every element
# this code takes ~10 seconds
def main_1():
    arr = np.zeros((M, N))  # (M, N) is the shape of the array
    for i in range(M):
        for j in range(N):
            arr[i, j] = calculate_element(i, j, 600, 2000, 300, 500)
    # print(arr)
    plt.figure(figsize=(8, 5))
    plt.imshow(arr, aspect='auto', origin='lower')
    plt.show()
# use index matrices
# this code takes <1 second
def main_2():
    index_matrix_i = np.array([range(M)] * N).T
    index_matrix_j = np.array([range(N)] * M)
    arr = calculate_element(index_matrix_i, index_matrix_j, 600, 2000, 300, 500)
    # print(arr)
    plt.figure(figsize=(8, 5))
    plt.imshow(arr, aspect='auto', origin='lower')
    plt.show()




Comments:

- Regardless of how you assign things, you're still calling calculate_element M*N times. If you can cache intermediate results of this function you could speed up the loops.
- The index matrices cost extra memory (on the order of M * N * np.intp each), easily avoidable with numba. But without the body of calculate_element and a size estimate of M, N it's just a guess.
- numpy methods work with or produce whole arrays. As such they can take up a lot of memory, even if it is in temporary buffers. It's easy to estimate memory usage from the shape and dtype. Iteration can avoid those large temporary arrays, though it won't change the size of the final result. But you lose speed, unless you implement the loops in a compilation tool like cython or numba.
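If the temporary buffers mentioned above are the concern and compiled tools like numba or cython aren't an option, one middle ground (a sketch, using the Gaussian example from the question) is to vectorize over one axis at a time: loop over the M rows in Python, but compute each row with length-N array operations, so temporaries never exceed O(N):

```python
import numpy as np

M, N = 1200, 4000
i_mid, j_mid, i_sig, j_sig = 600, 2000, 300, 500

# The j-dependent factor is the same for every row: compute it once.
j = np.arange(N)                                      # shape (N,)
gaus_j = np.exp(-((j - j_mid) ** 2) / (2 * j_sig ** 2))

arr = np.empty((M, N))
for i in range(M):  # M Python iterations instead of M * N
    gaus_i = np.exp(-((i - i_mid) ** 2) / (2 * i_sig ** 2))  # scalar
    arr[i] = gaus_i * gaus_j  # temporaries are length N, not M * N
```

This trades a little speed (M loop iterations) for tight memory use; for separable functions like this Gaussian it is nearly as fast as the fully vectorized version.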