[[   0.0   0.2   0.4    0.6 ]
 [   0.0   0.0   2.0    4.0    6.0 ]
 [   0.0   0.0   0.0   20.0   40.0   60.0 ]]
is a ragged list.  We can't build that with vectorized array magic, at least not with the ordinary stuff.
To get around this we need to either flatten or pad the structure:
[[   0.0   0.2   0.4    0.6    0.0    0.0]
 [   0.0   0.0   2.0    4.0    6.0    0.0 ]
 [   0.0   0.0   0.0   20.0   40.0   60.0 ]]
or 
[   0.0   0.2   0.4    0.6   0.0   0.0   2.0    4.0    6.0   0.0   0.0   0.0   20.0   40.0   60.0 ] 
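For reference, here's a plain loop baseline for this pad-and-sum (a sketch; it assumes, as in the transcript below, that the ragged rows come from x = [1, 10, 100] and y = [.2, .4, .6], with row i carrying i+1 leading zeros):

```python
import numpy as np

x = np.array([1, 10, 100])
y = np.array([.2, .4, .6])

# build the ragged rows: i+1 leading zeros, then x[i]*y
rows = [np.concatenate((np.zeros(i + 1), x[i] * y)) for i in range(3)]

# pad-and-sum by aligning each row at the left
out = np.zeros(6)
for r in rows:
    out[:len(r)] += r
# out is the column-sum of the padded array: [0, .2, 2.4, 24.6, 46, 60]
```

Any vectorized alternative should reproduce `out`.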
np.add.reduceat lets us sum contiguous blocks of a flat array, but we want a skip-sum.  I suppose I could explore flattening the transpose.
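To see why reduceat falls short: it sums contiguous blocks delimited by a list of start indices, not strided or shifted selections:

```python
import numpy as np

a = np.arange(10)
# sums a[0:3], a[3:7], a[7:] -- contiguous blocks only
sums = np.add.reduceat(a, [0, 3, 7])
# -> array([ 3, 18, 24])
```

The skip-sum needs elements gathered at a fixed offset from each row, which reduceat's block boundaries can't express directly.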
My first thought was that the padded array looks like a diagonalized one: the [.2, 2, 20] laid out on one diagonal, [.4, 4, 40] on the next offset, and so on.  I know scipy.sparse can build a matrix from a matrix of values and a set of offsets, but I don't think there's such a function in numpy.  The numpy ones all work with one offset at a time.
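scipy.sparse.dia_matrix is that constructor: it takes a matrix of diagonal values plus a set of offsets, though the data layout takes some care, since the dia format indexes each stored diagonal by column position (a sketch, assuming the x/y outer product used later in this answer):

```python
import numpy as np
from scipy.sparse import dia_matrix

x = np.array([1, 10, 100])
y = np.array([.2, .4, .6])
z = x[:, None] * y                       # 3x3 outer product
offsets = np.arange(1, 4)                # column k of z goes on diagonal k+1
data = np.zeros((3, 6))
for k, off in enumerate(offsets):
    data[k, off:off + 3] = z[:, k]       # dia format: data[k, j] lands at (j - offsets[k], j)
A = dia_matrix((data, offsets), shape=(3, 6))
col_sums = A.toarray().sum(0)            # the padded array's column sums
```

This reproduces the padded array, but filling `data` needs its own loop or shifting trick, so it mostly moves the problem around.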
But it also looks like the kind of offset that stride_tricks can produce.
Let's explore that:
In [458]: as_strided = np.lib.stride_tricks.as_strided
In [459]: Z=np.pad(z,[[0,0],[3,3]],mode='constant')
In [460]: Z
Out[460]: 
array([[  0. ,   0. ,   0. ,   0.2,   0.4,   0.6,   0. ,   0. ,   0. ],
       [  0. ,   0. ,   0. ,   2. ,   4. ,   6. ,   0. ,   0. ,   0. ],
       [  0. ,   0. ,   0. ,  20. ,  40. ,  60. ,   0. ,   0. ,   0. ]])
In [461]: Z.strides
Out[461]: (72, 8)       # produce an offset with strides (72+8, 8)
In [462]: as_strided(Z,shape=(3,6),strides=(80,8))
Out[462]: 
array([[  0. ,   0. ,   0. ,   0.2,   0.4,   0.6],
       [  0. ,   0. ,   2. ,   4. ,   6. ,   0. ],
       [  0. ,  20. ,  40. ,  60. ,   0. ,   0. ]])
That's the kind of shift we want, but the direction is wrong, so let's flip Z:
In [463]: Z1=Z[::-1,:].copy()
In [464]: as_strided(Z1,shape=(3,6),strides=(80,8))
Out[464]: 
array([[  0. ,   0. ,   0. ,  20. ,  40. ,  60. ],
       [  0. ,   0. ,   2. ,   4. ,   6. ,   0. ],
       [  0. ,   0.2,   0.4,   0.6,   0. ,   0. ]])
In [465]: as_strided(Z1,shape=(3,6),strides=(80,8)).sum(0)
Out[465]: array([  0. ,   0.2,   2.4,  24.6,  46. ,  60. ])
Generalization can be left to the reader.
Whether there's any speed advantage is unknown.  Probably not for this small case, maybe for a very large one.
This cleans up the padding and striding a bit:
In [497]: Z=np.pad(z,[[0,0],[1,4]],mode='constant')
In [498]: Z.strides
Out[498]: (64, 8)
In [499]: as_strided(Z,shape=(3,6),strides=(64-8,8))
Out[499]: 
array([[  0. ,   0.2,   0.4,   0.6,   0. ,   0. ],
       [  0. ,   0. ,   2. ,   4. ,   6. ,   0. ],
       [  0. ,   0. ,   0. ,  20. ,  40. ,  60. ]])
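This padding-and-striding wraps up into a general function (a sketch; it assumes a C-contiguous 2-d input, and it returns the length m+n-1 vector without the leading zero column the examples above carry):

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def skip_sum(z):
    """Column sums of z (n x m) with row i shifted right by i places."""
    n, m = z.shape
    Zp = np.pad(z, [[0, 0], [0, n - 1]], mode='constant')   # width m+n-1
    s0, s1 = Zp.strides
    # shrinking the row stride by one item starts each view row one
    # element earlier, so row i's content appears shifted right by i;
    # the positions before it read the previous row's zero padding
    view = as_strided(Zp, shape=(n, m + n - 1), strides=(s0 - s1, s1))
    return view.sum(0)

x = np.array([1, 10, 100])
y = np.array([.2, .4, .6])
result = skip_sum(x[:, None] * y)
# -> array([  0.2,   2.4,  24.6,  46. ,  60. ])
```

The usual as_strided caveat applies: the view reads memory the shape/strides describe without bounds checking, so the padding width has to be right.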
This ignores how z was constructed.  If the outer product is central to the problem, I might try the striding on the 1d y, and use x to perform a weighted sum.
In [553]: x=np.array([1,10,100]); y=np.array([.2,.4,.6])
In [554]: z=np.concatenate(([0,0],y[::-1],[0,0,0]))
In [555]: z
Out[555]: array([ 0. ,  0. ,  0.6,  0.4,  0.2,  0. ,  0. ,  0. ])
In [556]: Z=as_strided(z,shape=(3,6), strides=(8,8))
In [557]: Z
Out[557]: 
array([[ 0. ,  0. ,  0.6,  0.4,  0.2,  0. ],
       [ 0. ,  0.6,  0.4,  0.2,  0. ,  0. ],
       [ 0.6,  0.4,  0.2,  0. ,  0. ,  0. ]])
In [558]: np.dot(x,Z)
Out[558]: array([ 60. ,  46. ,  24.6,   2.4,   0.2,   0. ])
In this construction Z is a view on z, so it is smaller than the Z in the previous approach.  But I suspect dot makes a copy when it sends it to the compiled code.  np.einsum('i,ij',x,Z) might avoid that, doing its compiled iteration over the view without expanding it.  This may make a difference when dealing with very large arrays.
The result is reversed, but that's easily fixed.  I may even be able to fix it during construction.
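Put together (assuming float64 data; here the reversal is simply applied after the product, and einsum is the possibly copy-free variant):

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

x = np.array([1, 10, 100])
y = np.array([.2, .4, .6])
z = np.concatenate(([0, 0], y[::-1], [0, 0, 0]))
s = z.strides[0]
Z = as_strided(z, shape=(3, 6), strides=(s, s))   # each row starts one element later

fwd = np.dot(x, Z)[::-1]              # reverse after the weighted sum
alt = np.einsum('i,ij', x, Z)[::-1]   # same result, iterating over the view
# both -> array([  0. ,   0.2,   2.4,  24.6,  46. ,  60. ])
```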
Is outer an essential part of the problem, or just a convenient way of creating z?  Is this really some sort of inner product or weighted moving sum?