Made stricter
Veedrac

It's not pretty, but this gives a factor-3 speed improvement:

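import numpy as np

# Assumes A has shape (n, k) and B has shape (m, k); d ends up with shape (n, m).
d = (A**2).sum(axis=-1)[:, np.newaxis] + (B**2).sum(axis=-1)
d -= 2 * np.squeeze(A.dot(B[..., np.newaxis]), axis=-1)
d **= 0.5  # in-place square root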

This is based on the identity

$$ (a - b)^2 = a^2 + b^2 - 2ab $$

and so, ignoring the fudging with indices,

$$ \sum(a - b)^2 = \sum a^2 + \sum b^2 - 2\sum ab $$
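
Written with explicit indices, the same identity reads as follows. This is only a sketch under an assumption about shapes inferred from the code: row \$i\$ of `A` and row \$j\$ of `B` each have components indexed by \$k\$, and \$d_{ij}\$ denotes the distance between them.

$$ d_{ij}^2 = \sum_k (A_{ik} - B_{jk})^2 = \sum_k A_{ik}^2 + \sum_k B_{jk}^2 - 2\sum_k A_{ik} B_{jk} $$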

The squared terms are just

(A**2).sum(axis=-1)[:, np.newaxis] + (B**2).sum(axis=-1)

and \$\sum ab\$ is just the dot product \$\vec A \cdot \vec B\$. This can be broadcast with a bit of fudging of the axes:

np.squeeze(A.dot(B[..., np.newaxis]), axis=-1)
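
As a sanity check, the expanded form can be compared against the direct broadcast difference. This is a minimal sketch, assuming `A` has shape `(n, k)` and `B` has shape `(m, k)`, which is what the indexing above implies. The function names `pairwise_naive` and `pairwise_expanded` and the random test data are illustrative only. The `np.maximum` clamp is an addition that is not part of the answer: rounding can leave tiny negative values, which would become `nan` under the square root.

```python
import numpy as np

def pairwise_naive(A, B):
    # Direct definition: difference of every pair of rows, shape (n, m, k).
    diff = A[:, np.newaxis, :] - B[np.newaxis, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def pairwise_expanded(A, B):
    # sum(a^2) + sum(b^2), broadcast to shape (n, m).
    d = (A ** 2).sum(axis=-1)[:, np.newaxis] + (B ** 2).sum(axis=-1)
    # A.dot(B[..., np.newaxis]) has shape (n, m, 1); drop only the last axis.
    d -= 2 * np.squeeze(A.dot(B[..., np.newaxis]), axis=-1)
    # Guard against small negatives from rounding (an addition, not in the answer).
    np.maximum(d, 0, out=d)
    d **= 0.5
    return d

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
B = rng.standard_normal((4, 3))
assert np.allclose(pairwise_naive(A, B), pairwise_expanded(A, B))
```

Passing `axis=-1` to `np.squeeze` matters when `n` or `m` is 1: without it, those length-1 axes would be squeezed away as well and the in-place subtraction would no longer line up with `d`.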
added 6 characters in body
Veedrac

It's not pretty, but this gives a factor-3 speed improvement:

d = (A**2).sum(axis=-1)[:, np.newaxis] + (B**2).sum(axis=-1)
d -= 2 * np.squeeze(A.dot(B[..., np.newaxis]))
d **= 0.5

This is based on the identity

$$ (a - b)^2 = a^2 + b^2 - 2ab $$

and so, ignoring the fudging with indices,

$$ \sum(a - b)^2 = \sum a^2 + \sum b^2 - 2\sum ab $$

The squared terms are just

(A**2).sum(axis=-1)[:, np.newaxis] + (B**2).sum(axis=-1)

and \$\sum ab\$ is just the dot product \$\vec A \cdot \vec B\$. This can be broadcast with a bit of fudging of the axes:

np.squeeze(A.dot(B[..., None]))
Veedrac

It's not pretty, but this gives a factor-3 speed improvement:

d = (A**2).sum(axis=-1)[:, np.newaxis] + (B**2).sum(axis=-1)
d -= 2 * np.squeeze(A.dot(B[..., None]))
d **= 0.5

This is based on the identity

$$ (a - b)^2 = a^2 + b^2 - 2ab $$

and so, ignoring the fudging with indices,

$$ \sum(a - b)^2 = \sum a^2 + \sum b^2 - 2\sum ab $$

The squared terms are just

(A**2).sum(axis=-1)[:, np.newaxis] + (B**2).sum(axis=-1)

and \$\sum ab\$ is just the dot product \$\vec A \cdot \vec B\$. This can be broadcast with a bit of fudging of the axes:

np.squeeze(A.dot(B[..., None]))