Suppose I have two arrays A and B with dimensions (n1,m1,m2) and (n2,m1,m2), respectively. I want to compute the matrix C with dimensions (n1,n2) such that C[i,j] = sum((A[i,:,:] - B[j,:,:])^2). Here is what I have so far:

import numpy as np
A = np.array(range(1,13)).reshape(3,2,2)
B = np.array(range(1,9)).reshape(2,2,2)
C = np.zeros(shape=(A.shape[0], B.shape[0]) )
for i in range(A.shape[0]):
    for j in range(B.shape[0]):
        C[i,j] = np.sum(np.square(A[i,:,:] - B[j,:,:]))
C

What is the most efficient way to do this? In R I would use a vectorized approach, such as outer. Is there a similar method for Python?

Thanks.

1 Answer

You can use SciPy's cdist, which is efficient for this kind of calculation once the input arrays are reshaped to 2D, like so -

from scipy.spatial.distance import cdist

C = cdist(A.reshape(A.shape[0],-1),B.reshape(B.shape[0],-1),'sqeuclidean')
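
As a cross-check on the reshape step, here is a NumPy-only sketch of the same squared distances. It is not how cdist is implemented; it swaps in the standard expansion ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b on the flattened rows. The names A2, B2, sq_a, sq_b and C_expanded are introduced here for illustration only.

```python
import numpy as np

A = np.arange(1, 13).reshape(3, 2, 2)
B = np.arange(1, 9).reshape(2, 2, 2)

# Flatten the trailing (m1, m2) axes so each slice becomes one row vector.
A2 = A.reshape(A.shape[0], -1).astype(float)  # shape (n1, m1*m2)
B2 = B.reshape(B.shape[0], -1).astype(float)  # shape (n2, m1*m2)

# Squared norm of every row.
sq_a = np.einsum('ij,ij->i', A2, A2)  # shape (n1,)
sq_b = np.einsum('ij,ij->i', B2, B2)  # shape (n2,)

# ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b, evaluated for all row pairs at once.
C_expanded = sq_a[:, None] + sq_b[None, :] - 2.0 * (A2 @ B2.T)

# Floating-point cancellation can leave tiny negative values; clip them to zero.
C_expanded = np.maximum(C_expanded, 0.0)

print(C_expanded)
```

The largest temporary here has shape (n1, n2), the same as the result. With real-valued data the expansion can lose some precision when two rows are nearly identical, which is why the clipping step is included.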

The cdist approach should be memory efficient, which makes it the better choice for large inputs. For small input arrays, you can also use np.einsum with NumPy broadcasting, at the cost of an intermediate array of shape (n1,n2,m1,m2), like so -

diffs = A[:,None]-B
C = np.einsum('ijkl,ijkl->ij',diffs,diffs)
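
A minimal sketch to check the broadcasting version against the double loop from the question, using the same sample A and B. The name C_loop is introduced here for the reference result.

```python
import numpy as np

A = np.arange(1, 13).reshape(3, 2, 2)
B = np.arange(1, 9).reshape(2, 2, 2)

# Reference: the double loop from the question.
C_loop = np.zeros((A.shape[0], B.shape[0]))
for i in range(A.shape[0]):
    for j in range(B.shape[0]):
        C_loop[i, j] = np.sum(np.square(A[i] - B[j]))

# A[:, None] has shape (n1, 1, m1, m2); subtracting B broadcasts
# the result to shape (n1, n2, m1, m2).
diffs = A[:, None] - B

# Multiply diffs by itself elementwise and sum over the last two axes.
C_einsum = np.einsum('ijkl,ijkl->ij', diffs, diffs)

print(diffs.shape)                       # shape of the broadcast intermediate
print(np.array_equal(C_loop, C_einsum))  # do both methods agree?
```

The diffs array holds n1*n2*m1*m2 elements, so its memory use grows with the product of all four dimensions.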