
Let x be an unevenly spaced, strictly increasing NumPy array (e.g., x = np.array([0, .1, .3, .7, .99])). I have the following spline functions:

from scipy.interpolate import InterpolatedUnivariateSpline
a_ = InterpolatedUnivariateSpline(x, a)
b_ = InterpolatedUnivariateSpline(x, b)
c_ = InterpolatedUnivariateSpline(x, c)
d_ = InterpolatedUnivariateSpline(x, d)
e_ = InterpolatedUnivariateSpline(x, e)
f_ = InterpolatedUnivariateSpline(x, f)
g_ = InterpolatedUnivariateSpline(x, g)
w_ = InterpolatedUnivariateSpline(x, w)

which are built from NumPy arrays a, b, c, d, e, f, g, w, each of the same length as x.

Now let w instead be an array with shape (2, len(x)). I am running the following loop:

y1 = []
y2 = []
for ii in range(len(x)): 
    xi = x[ii]
    
    A_1 = np.array([[  a_(xi), -a_(xi)], 
                    [       0, -b_(xi)]])
    
    B_1 = np.array([[ -a_(xi),      0], 
                    [  b_(xi),      0]])
    
    C_1 = np.array([[       1,      0], 
                    [  c_(xi),      0]])
    
    D_1 = np.array([[       1,      0], 
                    [ -d_(xi),      1]])
    
    D_2 = np.array([[      -1,      0], 
                    [       0, e_(xi)]])
    
    Amat = A_1 + B_1 @ np.linalg.inv(D_1) @ C_1
    Bmat = B_1 @ np.linalg.inv(D_1) @ D_2 
    Cmat = np.linalg.inv(D_1) @ C_1
    Dmat = np.linalg.inv(D_1) @ D_2

    K1 = np.array([f_(xi), g_(xi)])

    y1 += [-Amat.T@w  - Cmat.T@K1]
    y2 += [ Dmat.T@K1 + Bmat.T@w ]
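A quick shape check with stand-in data (an assumption on my part, since the question does not show the full inputs) suggests that each iteration was presumably meant to use the column w[:, ii]: with the full (2, len(x)) array, the subtraction inside the loop does not broadcast.

```python
import numpy as np

# Stand-ins mirroring one loop iteration; N plays the role of len(x).
N = 4800
w = np.zeros((2, N))        # shape (2, len(x)) as described above
Amat = np.zeros((2, 2))
Cmat = np.zeros((2, 2))
K1 = np.zeros(2)

print((Amat.T @ w).shape)    # (2, N)
print((Cmat.T @ K1).shape)   # (2,)
# (2, N) and (2,) do not broadcast for N != 2, so -Amat.T@w - Cmat.T@K1
# raises a ValueError; with the column w[:, ii] every term is a (2,) vector:
print((-Amat.T @ w[:, 0] - Cmat.T @ K1).shape)   # (2,)
```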

This works; however, it is very slow, as x is large and I need to run this loop many times.

My feeling is that I should be able to vectorize all of this code, making it very fast. However, when I try, I run into problems with the @ matrix multiplications:

MemoryError: Unable to allocate array with shape (4800, 4800, 4800) and data type float64

Here is my attempt, with the code that produces that error:

nil = np.zeros(len(x))
A_1 = np.array([[ a_(x), -a_(x)], 
                [   nil, -b_(x)]])

B_1 = np.array([[-a_(x),    nil], 
                [ b_(x),    nil]])

C_1 = np.array([[ 1+nil,    nil], 
                [ c_(x),    nil]])

D_1 = np.array([[ 1+nil,    nil], 
                [-d_(x),  1+nil]])

D_2 = np.array([[ nil-1,    nil], 
                [   nil,  e_(x)]])

Amat = A_1 + B_1 @ np.linalg.pinv(D_1) @ C_1
Bmat = B_1 @ np.linalg.pinv(D_1) @ D_2
Cmat = np.linalg.pinv(D_1) @ C_1
Dmat = np.linalg.pinv(D_1) @ D_2

K1 = np.array([f_(x), g_(x)])

Cmat.T@K1 # produces the error 
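Chasing the shapes seems to explain the (4800, 4800, 4800): np.linalg.pinv and @ treat the *last two* axes as the matrix axes, so the (2, 2, N) stacks built above are interpreted as two (2, N) matrices rather than N little 2×2 ones. A small stand-in (N = 5 instead of 4800, zeros in place of the spline values) reproduces the cubic shape:

```python
import numpy as np

N = 5                              # small stand-in for len(x) == 4800
D_1 = np.zeros((2, 2, N))          # the (2, 2, N) layout from np.array([[...], [...]])

# pinv operates on the LAST two axes, so it sees a stack of two (2, N)
# matrices and returns shape (2, N, 2) -- not N elementwise 2x2 inverses:
P = np.linalg.pinv(D_1)
print(P.shape)                     # (2, N, 2)

C_1 = np.zeros((2, 2, N))
Cmat = P @ C_1                     # (2, N, 2) @ (2, 2, N) -> (2, N, N)
print(Cmat.shape)

K1 = np.zeros((2, N))
out = Cmat.T @ K1                  # (N, N, 2) @ (2, N) -> (N, N, N): cubic in len(x)
print(out.shape)
```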

At a basic level I can move the creation of the A_1, etc. matrices outside the loop, which speeds things up. However, is it possible to speed it up even more?
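For reference, a batched reformulation along these lines seems to avoid both the error and the loop: stack everything as (N, 2, 2) so the matrix axes come last, which is what np.linalg.inv and @ broadcast over. This is only a sketch with random stand-in data (the real code would use a_(x), b_(x), … in place of a, b, …), and it assumes, as one of the comments below suggests may be intended, that each iteration uses the column w[:, ii]:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100                                   # stand-in for len(x)
a, b, c, d, e, f, g = rng.random((7, N))  # stand-ins for a_(x), b_(x), ...
w = rng.random((2, N))

zero, one = np.zeros(N), np.ones(N)

def stack22(m00, m01, m10, m11):
    # Build an (N, 2, 2) stack: matrix axes LAST, so np.linalg.inv and @
    # operate on N separate 2x2 matrices instead of two (2, N) ones.
    return np.stack([np.stack([m00, m01], axis=-1),
                     np.stack([m10, m11], axis=-1)], axis=-2)

A_1 = stack22(   a,   -a, zero,   -b)
B_1 = stack22(  -a, zero,    b, zero)
C_1 = stack22( one, zero,    c, zero)
D_1 = stack22( one, zero,   -d,  one)
D_2 = stack22(-one, zero, zero,    e)

D_1_inv = np.linalg.inv(D_1)          # batched inverse, shape (N, 2, 2)
Amat = A_1 + B_1 @ D_1_inv @ C_1      # all products are batched (N, 2, 2)
Bmat = B_1 @ D_1_inv @ D_2
Cmat = D_1_inv @ C_1
Dmat = D_1_inv @ D_2

K1 = np.stack([f, g], axis=-1)[:, :, None]    # (N, 2, 1) column vectors
wc = w.T[:, :, None]                          # (N, 2, 1): one column per x[ii]

T = lambda M: np.swapaxes(M, -1, -2)          # per-slice transpose (plain .T
                                              # would reverse ALL axes)
y1 = (-T(Amat) @ wc - T(Cmat) @ K1)[:, :, 0]  # (N, 2)
y2 = ( T(Dmat) @ K1 + T(Bmat) @ wc)[:, :, 0]  # (N, 2)
```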

  • since you do matrix inversion, I don't think that using "large" matrix will lead to a faster code.. what is the time difference without Cmat.T@K1 line (and without equivalent code in the first approach)? Commented Aug 18, 2020 at 13:49
  • What's the shape of the arrays? Is the shape in the error expected? Commented Aug 18, 2020 at 14:07
  • Can you add a full working example? It looks like there is something wrong: y1 += [-Amat.T@w - Cmat.T@K1]. The interpolation of the a_,.. can be easily vectorized. There are for sure large speedups possible (you have loads of temporary arrays, calling BLAS routines with tiny arrays stackoverflow.com/a/49613797/4045774 ,etc. With a fully working example (also includes all inputs, which could be random) it should be straight forward to drastically speed this up using cython or numba without any additional memory overhead. Commented Aug 18, 2020 at 15:40
  • @max, full working examples involving memory errors are a pain to work with. Different size memories, long run times etc. Commented Aug 18, 2020 at 17:39
  • @hpaulj the shape in the error is not expected. The arrays are (2,2,4800), where 4800 is the length of x. In the loop example, each iteration (over the 4800) builds these little 2x2 matrices. Commented Aug 18, 2020 at 19:36
