In the context of a Gibbs sampler, I profiled my code and my major bottleneck is the following:
I need to compute the likelihood of N points, assuming each has been drawn from its own normal distribution (different means, but the same variance).
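In other words, the quantity needed at every iteration is the total log-likelihood log L = sum_i log N(y_i | mu_i, sigma^2), where the mu_i are the per-point means and sigma is the shared standard deviation.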
Here are two ways to compute it:
import numpy as np
from scipy.stats import multivariate_normal
from scipy.stats import norm
# Toy data
y = np.random.uniform(low=-1, high=1, size=100) # data points
loc = np.zeros(len(y)) # means
# Two alternatives
%timeit multivariate_normal.logpdf(y, mean=loc, cov=1)
%timeit sum(norm.logpdf(y, loc=loc, scale=1))
The first uses the recently implemented multivariate_normal of scipy: build the equivalent N-dimensional gaussian and compute the (log)probability of y.

1000 loops, best of 3: 1.33 ms per loop

The second uses the traditional norm function of scipy: compute the (log)probability of every point y individually and then sum the results.

10000 loops, best of 3: 130 µs per loop
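For reference, both alternatives evaluate the same scalar: the sum of the per-point normal log-densities. Because the variance is shared, that sum has a simple closed form that can be written directly in NumPy, without scipy's per-call overhead. A minimal sketch (the function name normal_loglik is just illustrative):

import numpy as np

def normal_loglik(y, loc, scale=1.0):
    # Total log-likelihood of y under independent normals N(loc_i, scale^2):
    # sum_i [ -0.5*log(2*pi*scale^2) - (y_i - loc_i)^2 / (2*scale^2) ]
    n = y.size
    return (-0.5 * n * np.log(2 * np.pi * scale ** 2)
            - np.sum((y - loc) ** 2) / (2 * scale ** 2))

# Sanity check: normal_loglik(y, loc) should match sum(norm.logpdf(y, loc=loc, scale=1))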
Since this is part of a Gibbs sampler, I need to repeat this computation around 10,000 times, and therefore I need it to be as fast as possible.
How can I improve it?
(either from Python, or by calling Cython, R, or whatever)