
I needed to compute Q^N for many different values of N (between 1 and 10000), and NumPy was a bit too slow.

I asked on math.stackexchange.com whether I could avoid computing Q^N for my specific need, and someone answered that computing Q^N should be quite fast using the P D^N P^-1 method (eigendecomposition).

So basically, instead of doing:

import numpy as np
from numpy import linalg as LA
...
LA.matrix_power(m, N)

I tried:

diag, P = LA.eig(m)    # eigenvalues (diag) and eigenvectors (columns of P)
DN = np.diag(diag**N)  # D^N: each eigenvalue raised to the N-th power
P1 = LA.inv(P)

P*DN*P1                # P D^N P^-1 (with np.matrix, * is matrix multiplication; use .dot for plain arrays)

And I obtain the same matrix as a result (tried on a single example).

On a more complex matrix, Q:

% timeit.Timer('Q**10000', setup=setup).repeat(2, 100)
[5.87254786491394, 5.863131046295166]

% timeit.Timer('diag, P = linalg.eig(Q); DN=np.diag(diag**10000);P1=linalg.inv(P); P*DN*P1', setup=setup).repeat(2, 100)
[2.0032401084899902, 2.018735885620117]

And regarding my initial problem, the second method lets me compute P, diag and P1 only once and reuse them thousands of times; done that way, it is 8 times faster.
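
For illustration, here is a minimal sketch of that reuse pattern (my own code, assuming Q is a plain ndarray, hence the explicit .dot calls):

import numpy as np
from numpy import linalg as LA

def powers_via_eig(Q, exponents):
    # decompose once: Q = P D P^-1 (only valid if Q is diagonalizable)
    diag, P = LA.eig(Q)
    P1 = LA.inv(P)
    # reuse diag, P and P1 for every exponent; only D^N changes
    return [P.dot(np.diag(diag**N)).dot(P1) for N in exponents]

# sanity check on a small transition-like matrix with distinct eigenvalues
Q = np.diag([0.0, 0.3, 0.7, 1.0]) + np.diag([1.0, 0.7, 0.3], k=1)
QN = powers_via_eig(Q, [1, 10, 10000])
print(np.allclose(QN[1], LA.matrix_power(Q, 10)))  # True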

My questions are:

  • In which cases is it not possible to use this method to compute Q^N?
  • Is it fine to use it in my case (matrix Q as given here)?
  • Is there a function in NumPy that already does this?

Edit:

  • It appears that for another matrix, P is not invertible. So I am adding a new question: how can I change the matrix P so that it becomes invertible, without altering the resulting matrix too much? It's OK if the values are close to the real result; by close I mean within ~0.0001.
  • Regarding my first 2 questions: I guess Q must not be defective. But I don't know whether my matrices are defective or not (my math background is too far behind me).
  • You could further speed this up by doing diag**10000 using the exponentiation by squaring method. See my answer to another question, where I implement it in numpy.
  • @Claudiu wow. I naively thought that diag**10000 would use the squaring method! However, in my case I may even take advantage of being able to pass floating-point exponents, which is something I could not do with LA.matrix_power.
  • You can do this as long as the matrix is diagonalizable. If the matrix Q is real and symmetric, you will always be able to do this. Good luck, Maxime.
  • I agree, wok; as I said before, I don't know whether my matrices are defective or not (Wikipedia says: "A square matrix which is not diagonalizable is called defective."). The matrix Q is not symmetric (it is triangular). Maybe I should do something with the determinant of the matrix; I don't know, I need to dig into the mathematical theorems, I guess. Thanks ;).

3 Answers


You have already figured out that your eigenvalues will be (0, a, b, c, ..., 1). Let me rename your parameters so that the eigenvalues are (0, e1, e2, e3, ..., 1). To find the eigenvector (v0, v1, v2, ..., v(n-1)) corresponding to the eigenvalue ej, you have to solve the system of equations:

v1                    = v0*ej
v1*e1 + v2*(1-e1)     = v1*ej
v2*e2 + v3*(1-e2)     = v2*ej
...
vj*ej + v(j+1)*(1-ej) = vj*ej
...
v(n-1)                = v(n-1)*ej

It is more or less clear that, if all your ei are distinct and none of them is equal to 0 or 1, the solution is always well defined: the eigenvector for ej has its first j+1 components (v0 through vj) non-zero and the rest equal to zero. This guarantees that no eigenvector is a linear combination of the others, and hence that the eigenvector matrix is invertible.

The problem comes when one of your ei is 0 or 1, or when a value is repeated. I haven't been able to come up with a proof, but experimenting with the following code, it seems that you only need to worry if any two of your ei are equal and different from 1:

>>> import numpy as np
>>> def make_mat(values):
...     # upper bidiagonal matrix with diagonal (0, e1, ..., ek, 1)
...     # and superdiagonal (1, 1-e1, ..., 1-ek): every row sums to 1
...     main_diag = np.concatenate(([0], values, [1]))
...     up_diag = 1 - np.concatenate(([0], values))
...     return np.diag(main_diag) + np.diag(up_diag, k=1)
>>> make_mat([4,5,6])
array([[ 0,  1,  0,  0,  0],
       [ 0,  4, -3,  0,  0],
       [ 0,  0,  5, -4,  0],
       [ 0,  0,  0,  6, -5],
       [ 0,  0,  0,  0,  1]])
>>> a, b = np.linalg.eig(make_mat([4,5,6]))
>>> a
array([ 0.,  4.,  5.,  6.,  1.])
>>> b
array([[ 1.        ,  0.24253563, -0.18641093,  0.13608276,  0.4472136 ],
       [ 0.        ,  0.9701425 , -0.93205465,  0.81649658,  0.4472136 ],
       [ 0.        ,  0.        ,  0.31068488, -0.54433105,  0.4472136 ],
       [ 0.        ,  0.        ,  0.        ,  0.13608276,  0.4472136 ],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  0.4472136 ]])

And now for some test cases:

>>> a, b = np.linalg.eig(make_mat([1,0,3])) # having a 0 or 1 is OK
>>> b
array([[ 1.        ,  0.70710678,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.70710678,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  1.        ,  0.31622777,  0.57735027],
       [ 0.        ,  0.        ,  0.        ,  0.9486833 ,  0.57735027],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  0.57735027]])
>>> a, b = np.linalg.eig(make_mat([1,1,3])) # repeating 1 is OK
>>> b
array([[ 1.        ,  0.70710678,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.70710678,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  1.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  1.        ,  0.70710678],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  0.70710678]])
>>> a, b = np.linalg.eig(make_mat([0,0,3])) # repeating 0 is not OK
>>> np.round(b, 3)
array([[ 1.   , -1.   ,  1.   ,  0.035,  0.447],
       [ 0.   ,  0.   ,  0.   ,  0.105,  0.447],
       [ 0.   ,  0.   ,  0.   ,  0.314,  0.447],
       [ 0.   ,  0.   ,  0.   ,  0.943,  0.447],
       [ 0.   ,  0.   ,  0.   ,  0.   ,  0.447]])
>>> a, b = np.linalg.eig(make_mat([2,3,3])) # repeating other values is not OK
>>> np.round(b, 3)
array([[ 1.   ,  0.447, -0.229, -0.229,  0.447],
       [ 0.   ,  0.894, -0.688, -0.688,  0.447],
       [ 0.   ,  0.   ,  0.688,  0.688,  0.447],
       [ 0.   ,  0.   ,  0.   ,  0.   ,  0.447],
       [ 0.   ,  0.   ,  0.   ,  0.   ,  0.447]])
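
If you would rather not eyeball the eigenvector matrix, a quick numerical check (my addition, with the usual floating-point caveats) is the rank or condition number of b; the defective cases show up as rank-deficient and badly conditioned:

>>> for values in ([4, 5, 6], [1, 1, 3], [0, 0, 3], [2, 3, 3]):
...     a, b = np.linalg.eig(make_mat(values))
...     # full rank (5) with a moderate condition number means b is
...     # safely invertible; [0,0,3] and [2,3,3] come out rank-deficient
...     print(values, np.linalg.matrix_rank(b), np.linalg.cond(b))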

3 Comments

I am currently reading your answer, thank you. But just a quick comment: the sum of each row of my matrix is 1 (because it is the transition matrix of a Markov chain).
Yes, that is taken into account. It is the a, b, c... that shouldn't be equal to each other, unless they are equal to 1.
Ha ha, indeed :). But a, b, c, etc. are in fact probabilities. That's what I should have said, sorry.

I am partially answering my question:

According to the source code, I think NumPy uses exponentiation by squaring:

# binary decomposition to reduce the number of matrix
# multiplications for n > 3 (in numpy's source file, N is an
# alias for the numpy core module and n is the integer exponent)
beta = binary_repr(n)
Z, q, t = M, 0, len(beta)
while beta[t-q-1] == '0':
    Z = N.dot(Z, Z)
    q += 1
result = Z
for k in range(q+1, t):
    Z = N.dot(Z, Z)
    if beta[t-k-1] == '1':
        result = N.dot(result, Z)
return result

This is slower in my case, when n is large, than computing the eigenvalues and eigenvectors once and computing M^N as P D^N P^-1.

Now, regarding my questions:

In which case it is not possible to use this last method to compute Q^N?

When some eigenvalues are equal, it may not be possible to invert P. Someone suggested implementing this method in NumPy on the issue tracker. The answer was: "Your approach is only valid for non-defective dense matrices."

Is it fine to use it in my case (matrix Q as given here)?

Not always; I might have several equal eigenvalues.

Is there in numpy a function that already does it?

I think it is in SciPy: https://github.com/scipy/scipy/blob/v0.12.0/scipy/linalg/matfuncs.py#L57

So we can also do this (note that expm and logm live in scipy.linalg, not numpy.linalg):

from scipy import linalg as SLA
SLA.expm(n * SLA.logm(m))

to compute m^n.

How can I change the matrix P so it becomes invertible but the resulting matrix is not too altered? I mean, it's ok if the values are close to the real result, by close I mean ~0.0001.

I cannot simply add a small epsilon, because the decomposition is sensitive when the eigenvalues are too close together; I'm pretty sure that could lead to unpredictable errors.
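
A hypothetical experiment (my code, not from the original post) that illustrates the sensitivity: as two diagonal probabilities approach each other, the condition number of P blows up, so P^-1 amplifies whatever error the epsilon introduces:

import numpy as np

for eps in (1e-1, 1e-3, 1e-6):
    # transition-like matrix with two nearly equal probabilities
    Q = np.diag([0.0, 0.5, 0.5 + eps, 1.0]) + np.diag([1.0, 0.5, 0.5 - eps], k=1)
    w, P = np.linalg.eig(Q)
    print(eps, np.linalg.cond(P))  # grows rapidly as eps shrinks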

1 Comment

The matrix exponential option is interesting (i.e. using la.expm), but it seems to be even slower than matrix_power on my machine. Diagonalizing whenever possible is probably your best bet.

Regarding the question:

In which case it is not possible to use this last method to compute Q^N?

The key idea is to tell whether Q is diagonalizable. Equivalently, we should tell whether Q has n (the number of its rows/columns) linearly independent eigenvectors. Note that distinct eigenvalues are a sufficient, but not necessary, condition for diagonalizability.
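
As a practical sketch (my code; the rank tolerance is a judgment call in floating point), you can test this by checking whether the eigenvector matrix returned by np.linalg.eig has full rank:

import numpy as np

def is_diagonalizable(Q, tol=1e-10):
    # Q is diagonalizable iff it has n linearly independent
    # eigenvectors, i.e. the eigenvector matrix has full rank
    w, P = np.linalg.eig(Q)
    return np.linalg.matrix_rank(P, tol=tol) == Q.shape[0]

Even when the rank test passes, a very large np.linalg.cond(P) warns that P^-1 will amplify rounding errors, which ties back to the epsilon question above.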
