[[   0.0   0.2   0.4    0.6 ]
 [   0.0   0.0   2.0    4.0    6.0 ]
 [   0.0   0.0   0.0   20.0   40.0   60.0 ]]
is a ragged list.  We can't build that with vectorized array magic, at least not with the ordinary stuff.
To get around this we need to either flatten or pad the structure:
[[   0.0   0.2   0.4    0.6    0.0    0.0]
 [   0.0   0.0   2.0    4.0    6.0    0.0 ]
 [   0.0   0.0   0.0   20.0   40.0   60.0 ]]
or 
[   0.0   0.2   0.4    0.6   0.0   0.0   2.0    4.0    6.0   0.0   0.0   0.0   20.0   40.0   60.0 ] 
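For reference, here's a plain loop baseline for this pad-and-sum (a sketch; it assumes, as in the transcript below, that the ragged rows come from x = [1, 10, 100] and y = [.2, .4, .6], with row i carrying i+1 leading zeros):

```python
import numpy as np

x = np.array([1, 10, 100])
y = np.array([.2, .4, .6])

# build the ragged rows: i+1 leading zeros, then x[i]*y
rows = [np.concatenate((np.zeros(i + 1), x[i] * y)) for i in range(3)]

# pad-and-sum by aligning each row at the left
out = np.zeros(6)
for r in rows:
    out[:len(r)] += r
# out is the column-sum of the padded array: [0, .2, 2.4, 24.6, 46, 60]
```

Any vectorized alternative should reproduce `out`.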
np.add.reduceat lets us sum contiguous blocks of a flat array, but we want a skip-sum.  I suppose I could explore flattening the transpose.
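To see why reduceat falls short: it sums contiguous blocks delimited by a list of start indices, not strided or shifted selections:

```python
import numpy as np

a = np.arange(10)
# sums a[0:3], a[3:7], a[7:] -- contiguous blocks only
sums = np.add.reduceat(a, [0, 3, 7])
# -> array([ 3, 18, 24])
```

The skip-sum needs elements gathered at a fixed offset from each row, which reduceat's block boundaries can't express directly.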
My first thought was that the padded array looks like a diagonalized one: the [.2, 2, 20] laid out on one diagonal, [.4, 4, 40] on the next offset, and so on.  I know scipy.sparse can build a matrix from a matrix of values and a set of offsets, but I don't think there's such a function in numpy.  The numpy ones all work with one offset at a time.
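scipy.sparse.dia_matrix is that constructor: it takes a matrix of diagonal values plus a set of offsets, though the data layout takes some care, since the dia format indexes each stored diagonal by column position (a sketch, assuming the x/y outer product used later in this answer):

```python
import numpy as np
from scipy.sparse import dia_matrix

x = np.array([1, 10, 100])
y = np.array([.2, .4, .6])
z = x[:, None] * y                       # 3x3 outer product
offsets = np.arange(1, 4)                # column k of z goes on diagonal k+1
data = np.zeros((3, 6))
for k, off in enumerate(offsets):
    data[k, off:off + 3] = z[:, k]       # dia format: data[k, j] lands at (j - offsets[k], j)
A = dia_matrix((data, offsets), shape=(3, 6))
col_sums = A.toarray().sum(0)            # the padded array's column sums
```

This reproduces the padded array, but filling `data` needs its own loop or shifting trick, so it mostly moves the problem around.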
But it also looks like the kind of offset that stride_tricks can produce.
Let's explore that:
In [458]: as_strided = np.lib.stride_tricks.as_strided
In [459]: Z=np.pad(z,[[0,0],[3,3]],mode='constant')
In [460]: Z
Out[460]: 
array([[  0. ,   0. ,   0. ,   0.2,   0.4,   0.6,   0. ,   0. ,   0. ],
       [  0. ,   0. ,   0. ,   2. ,   4. ,   6. ,   0. ,   0. ,   0. ],
       [  0. ,   0. ,   0. ,  20. ,  40. ,  60. ,   0. ,   0. ,   0. ]])
In [461]: Z.strides
Out[461]: (72, 8)       # produce an offset with strides (72+8, 8)
In [462]: as_strided(Z,shape=(3,6),strides=(80,8))
Out[462]: 
array([[  0. ,   0. ,   0. ,   0.2,   0.4,   0.6],
       [  0. ,   0. ,   2. ,   4. ,   6. ,   0. ],
       [  0. ,  20. ,  40. ,  60. ,   0. ,   0. ]])
That's the kind of shift we want, but the direction is wrong, so let's flip Z:
In [463]: Z1=Z[::-1,:].copy()
In [464]: as_strided(Z1,shape=(3,6),strides=(80,8))
Out[464]: 
array([[  0. ,   0. ,   0. ,  20. ,  40. ,  60. ],
       [  0. ,   0. ,   2. ,   4. ,   6. ,   0. ],
       [  0. ,   0.2,   0.4,   0.6,   0. ,   0. ]])
In [465]: as_strided(Z1,shape=(3,6),strides=(80,8)).sum(0)
Out[465]: array([  0. ,   0.2,   2.4,  24.6,  46. ,  60. ])
Generalization can be left to the reader.
Whether there's any speed advantage is unknown.  Probably not for this small case, maybe for a very large one.
This cleans up the padding and striding a bit:
In [497]: Z=np.pad(z,[[0,0],[1,4]],mode='constant')
In [498]: Z.strides
Out[498]: (64, 8)
In [499]: as_strided(Z,shape=(3,6),strides=(64-8,8))
Out[499]: 
array([[  0. ,   0.2,   0.4,   0.6,   0. ,   0. ],
       [  0. ,   0. ,   2. ,   4. ,   6. ,   0. ],
       [  0. ,   0. ,   0. ,  20. ,  40. ,  60. ]])
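This padding-and-striding wraps up into a general function (a sketch; it assumes a C-contiguous 2-d input, and it returns the length m+n-1 vector without the leading zero column the examples above carry):

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def skip_sum(z):
    """Column sums of z (n x m) with row i shifted right by i places."""
    n, m = z.shape
    Zp = np.pad(z, [[0, 0], [0, n - 1]], mode='constant')   # width m+n-1
    s0, s1 = Zp.strides
    # shrinking the row stride by one item starts each view row one
    # element earlier, so row i's content appears shifted right by i;
    # the positions before it read the previous row's zero padding
    view = as_strided(Zp, shape=(n, m + n - 1), strides=(s0 - s1, s1))
    return view.sum(0)

x = np.array([1, 10, 100])
y = np.array([.2, .4, .6])
result = skip_sum(x[:, None] * y)
# -> array([  0.2,   2.4,  24.6,  46. ,  60. ])
```

The usual as_strided caveat applies: the view reads memory the shape/strides describe without bounds checking, so the padding width has to be right.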
This ignores how z was constructed.  If the outer product is central to the problem, I might try the striding on the 1d y, and use x to perform a weighted sum.
In [553]: x=np.array([1,10,100]); y=np.array([.2,.4,.6])
In [554]: z=np.concatenate(([0,0],y[::-1],[0,0,0]))
In [555]: z
Out[555]: array([ 0. ,  0. ,  0.6,  0.4,  0.2,  0. ,  0. ,  0. ])
In [556]: Z=as_strided(z,shape=(3,6), strides=(8,8))
In [557]: Z
Out[557]: 
array([[ 0. ,  0. ,  0.6,  0.4,  0.2,  0. ],
       [ 0. ,  0.6,  0.4,  0.2,  0. ,  0. ],
       [ 0.6,  0.4,  0.2,  0. ,  0. ,  0. ]])
In [558]: np.dot(x,Z)
Out[558]: array([ 60. ,  46. ,  24.6,   2.4,   0.2,   0. ])
In this construction Z is a view on z, so it is smaller than the Z in the previous approach.  But I suspect dot makes a copy when it sends it to the compiled code.  np.einsum('i,ij',x,Z) might avoid that, doing its compiled iteration over the view without expanding it.  This may make a difference when dealing with very large arrays.
The result is reversed, but that's easily fixed.  I may even be able to fix it during construction.
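Put together (assuming float64 data; here the reversal is simply applied after the product, and einsum is the possibly copy-free variant):

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

x = np.array([1, 10, 100])
y = np.array([.2, .4, .6])
z = np.concatenate(([0, 0], y[::-1], [0, 0, 0]))
s = z.strides[0]
Z = as_strided(z, shape=(3, 6), strides=(s, s))   # each row starts one element later

fwd = np.dot(x, Z)[::-1]              # reverse after the weighted sum
alt = np.einsum('i,ij', x, Z)[::-1]   # same result, iterating over the view
# both -> array([  0. ,   0.2,   2.4,  24.6,  46. ,  60. ])
```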
Is outer an essential part of the problem, or just a convenient way of creating z?  Is this really some sort of inner product or weighted moving sum?