Revisions to Faster solution for row-wise matrix subtraction

Made stricter

edited Jan 12, 2015 at 19:41

It's not pretty, but this gives a factor-3 speed improvement:

d = (A**2).sum(axis=-1)[:, np.newaxis] + (B**2).sum(axis=-1)
d -= 2 * np.squeeze(A.dot(B[..., np.newaxis]), axis=-1)
d **= 0.5

This is based on the identity

$$ (a - b)^2 = a^2 + b^2 - 2ab $$

and so, ignoring the fudging with indices,

$$ \sum(a - b)^2 = \sum a^2 + \sum b^2 - 2\sum ab $$
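
With the indices written out, the same identity reads as follows. This is only a sketch: the symbols \$i\$, \$j\$, \$k\$ and \$d_{ij}\$ are introduced here for illustration, and it assumes `A` and `B` are 2-D arrays with one vector per row and the same number of columns.

$$ d_{ij}^2 = \sum_k (A_{ik} - B_{jk})^2 = \sum_k A_{ik}^2 + \sum_k B_{jk}^2 - 2\sum_k A_{ik} B_{jk} $$

The first two sums depend on only one of \$i\$ or \$j\$, so each can be computed once per row and broadcast; only the last sum couples the two arrays.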

The squared terms are just

(A**2).sum(axis=-1)[:, np.newaxis] + (B**2).sum(axis=-1)

and the cross term is a dot product, \$\sum ab = \vec A \cdot \vec B\$. This can be broadcast with a bit of fudging the axes:

np.squeeze(A.dot(B[..., np.newaxis]), axis=-1)
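
One way to check the expanded form is to compare it with the direct broadcast-and-subtract computation on small random inputs. The sketch below assumes `A` has shape `(n, k)` and `B` has shape `(m, k)`, both floating point; the function names and the clamping step are additions for illustration, not part of the answer above.

```python
import numpy as np


def pairwise_dist_naive(A, B):
    # Direct approach: build the full (n, m, k) array of differences, then reduce.
    diff = A[:, np.newaxis, :] - B[np.newaxis, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def pairwise_dist_expanded(A, B):
    # Squared norms: (n, 1) + (m,) broadcasts to (n, m).
    d = (A**2).sum(axis=-1)[:, np.newaxis] + (B**2).sum(axis=-1)
    # Cross term: A.dot(B[..., np.newaxis]) has shape (n, m, 1); drop only the last axis.
    d -= 2 * np.squeeze(A.dot(B[..., np.newaxis]), axis=-1)
    # Assumption, not in the answer: clamp small negative values caused by
    # floating-point cancellation so the square root does not produce NaN.
    np.maximum(d, 0, out=d)
    d **= 0.5
    return d


rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
B = rng.standard_normal((4, 3))

assert np.allclose(pairwise_dist_naive(A, B), pairwise_dist_expanded(A, B))
```

The explicit `axis=-1` matters when `B` has a single row: the dot product then has shape `(n, 1, 1)`, and a bare `np.squeeze` would collapse it to `(n,)`, which no longer lines up with the `(n, 1)` array of squared norms.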

added 6 characters in body

edited Jan 12, 2015 at 11:06

Veedrac

It's not pretty, but this gives a factor-3 speed improvement:

d = (A**2).sum(axis=-1)[:, np.newaxis] + (B**2).sum(axis=-1)
d -= 2 * np.squeeze(A.dot(B[..., np.newaxis]))
d **= 0.5

This is based off of the fact

$$ (a - b)^2 = a^2 + b^2 - 2ab $$

and so, ignoring the fudging with indices,

$$ \sum(a - b)^2 = \sum a^2 + \sum b^2 - 2\sum ab $$

The squared terms are just

(A**2).sum(axis=-1)[:, np.newaxis] + (B**2).sum(axis=-1)

and that \$\sum ab = \vec A \cdot \vec B\$. This can be broadcast with a bit of fudging the axes:

np.squeeze(A.dot(B[..., None]))

answered Jan 11, 2015 at 22:01

Veedrac

It's not pretty, but this gives a factor-3 speed improvement:

d = (A**2).sum(axis=-1)[:, np.newaxis] + (B**2).sum(axis=-1)
d -= 2 * np.squeeze(A.dot(B[..., None]))
d **= 0.5

This is based off of the fact

$$ (a - b)^2 = a^2 + b^2 - 2ab $$

and so, ignoring the fudging with indices,

$$ \sum(a - b)^2 = \sum a^2 + \sum b^2 - 2\sum ab $$

The squared terms are just

(A**2).sum(axis=-1)[:, np.newaxis] + (B**2).sum(axis=-1)

and that \$\sum ab = \vec A \cdot \vec B\$. This can be broadcast with a bit of fudging the axes:

np.squeeze(A.dot(B[..., None]))
