coderodde
  • 32.1k
  • 15
  • 78
  • 205

Since $$\sqrt{D_a} \sqrt{D_b} = \sqrt{D_a D_b}$$ you can get rid of one call to sqrt:

#include <cmath>      // std::sqrt
#include <stdexcept>  // std::runtime_error

double cosine_similarity2(double *A, double *B, unsigned int size)
{
    double mul = 0.0, d_a = 0.0, d_b = 0.0;

    // Accumulate the dot product and both squared norms in a single pass.
    for (unsigned int i = 0; i < size; ++i)
    {
        mul += A[i] * B[i];
        d_a += A[i] * A[i];
        d_b += B[i] * B[i];
    }

    // Reject zero vectors before dividing.
    if (d_a == 0 || d_b == 0)
    {
        throw std::runtime_error(
                "cosine similarity is not defined whenever one or both "
                "input vectors are zero-vectors.");
    }

    return mul / std::sqrt(d_a * d_b);
}
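
Spelled out, and assuming that $D_a$ and $D_b$ stand for the sums of squares the code accumulates in d_a and d_b (the text above does not define them explicitly), the rewrite amounts to:

$$\frac{\sum_i A_i B_i}{\sqrt{D_a}\,\sqrt{D_b}} = \frac{\sum_i A_i B_i}{\sqrt{D_a D_b}}, \qquad D_a = \sum_i A_i^2 \ge 0, \quad D_b = \sum_i B_i^2 \ge 0$$

The identity holds because both sums are non-negative.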

Minor

Usually there is no space before the semicolon ;. On the other hand, a single space usually follows the for keyword:

for (int i = 0; ...; ++i)
{  ^
    ...
}

Divide by zero

Note that sqrt(d_a * d_b) == 0 only when at least one of the input vectors is a zero vector. Cosine similarity is not defined for zero vectors, so it makes sense to me to throw an exception in that case instead of dividing by zero.
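
As a rough sketch of how a caller might deal with that exception, the snippet below restates the function so that it compiles on its own and wraps it in a helper. The helper name try_cosine_similarity, its bool-plus-out-parameter shape, the const-qualified pointers, and the choice to treat mismatched lengths as a failure are all assumptions made for illustration, not part of the reviewed code.

```cpp
#include <cmath>
#include <stdexcept>
#include <vector>

// Same computation as cosine_similarity2 above, restated so the sketch is
// self-contained (const pointers are an assumption; inputs are only read).
double cosine_similarity2(const double *A, const double *B, unsigned int size)
{
    double mul = 0.0, d_a = 0.0, d_b = 0.0;

    for (unsigned int i = 0; i < size; ++i)
    {
        mul += A[i] * B[i];
        d_a += A[i] * A[i];
        d_b += B[i] * B[i];
    }

    if (d_a == 0 || d_b == 0)
    {
        throw std::runtime_error(
                "cosine similarity is not defined whenever one or both "
                "input vectors are zero-vectors.");
    }

    return mul / std::sqrt(d_a * d_b);
}

// Hypothetical helper: returns false instead of propagating the exception.
// On success the similarity is written to 'out'; on failure 'out' is untouched.
bool try_cosine_similarity(const std::vector<double> &a,
                           const std::vector<double> &b,
                           double &out)
{
    if (a.size() != b.size())
    {
        return false;  // assumption: mismatched lengths count as a failure
    }

    try
    {
        out = cosine_similarity2(a.data(), b.data(),
                                 static_cast<unsigned int>(a.size()));
        return true;
    }
    catch (const std::runtime_error &)
    {
        return false;  // at least one input was a zero vector
    }
}
```

Whether to let the exception propagate or to convert it into a status value like this depends on the calling code; the wrapper only shows one possible choice.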