Sum along axis in numpy array

Question

I want to understand how ndarray.sum(axis=) works. I know that axis=0 is for columns and axis=1 is for rows. But with 3 dimensions (3 axes) it's difficult to interpret the result below.

arr = np.arange(0,30).reshape(2,3,5)

arr
Out[1]: 
array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]],

       [[15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29]]])

arr.sum(axis=0)
Out[2]: 
array([[15, 17, 19, 21, 23],
       [25, 27, 29, 31, 33],
       [35, 37, 39, 41, 43]])


arr.sum(axis=1)
Out[8]: 
array([[15, 18, 21, 24, 27],
       [60, 63, 66, 69, 72]])

arr.sum(axis=2)
Out[3]: 
array([[ 10,  35,  60],
       [ 85, 110, 135]])

In this example, a 3-axis array of shape (2,3,5), there are 3 rows and 5 columns. But if I look at the array as a whole, it seems to have only two rows (each with 3 array elements).

Can anyone please explain how this sum works on arrays with 3 or more axes (dimensions)?

MSeifert · Accepted Answer · 2017-01-19 04:08:20Z

If you want to keep the dimensions you can specify keepdims:

>>> arr = np.arange(0,30).reshape(2,3,5)
>>> arr.sum(axis=0, keepdims=True)
array([[[15, 17, 19, 21, 23],
        [25, 27, 29, 31, 33],
        [35, 37, 39, 41, 43]]])

Otherwise the axis you sum along is removed from the shape. An easy way to keep track of this is using the numpy.ndarray.shape property:

>>> arr.shape
(2, 3, 5)

>>> arr.sum(axis=0).shape
(3, 5)  # the first entry of the shape (index = axis = 0) was removed

>>> arr.sum(axis=1).shape
(2, 5)  # the second entry (index = axis = 1) was removed

You can also sum along multiple axes if you want (reducing the dimensionality by the number of axes specified):

>>> arr.sum(axis=(0, 1))
array([75, 81, 87, 93, 99])
>>> arr.sum(axis=(0, 1)).shape
(5,)  # the first and second entries are removed
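One reason to keep the summed axis is broadcasting. The following is only a sketch, assuming the same arr as in the question; the variable names row_sums, row_sums_kept and fractions are made up for illustration:

```python
import numpy as np

arr = np.arange(0, 30).reshape(2, 3, 5)

row_sums = arr.sum(axis=2)                      # shape (2, 3): axis 2 is gone
row_sums_kept = arr.sum(axis=2, keepdims=True)  # shape (2, 3, 1): axis 2 has length 1

# The kept length-1 axis lets the sums broadcast against the original array,
# here dividing every element by the sum of its own row.
fractions = arr / row_sums_kept

print(row_sums.shape, row_sums_kept.shape, fractions.shape)
```

Without keepdims the division would fail to broadcast, because a (2, 3) array does not line up with the trailing axes of a (2, 3, 5) array.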

akuiper · Accepted Answer · 2017-01-19 04:14:52Z

Here is another way to interpret this. You can consider a multi-dimensional array as a tensor, T[i][j][k], where i, j, k represent axes 0, 1, 2 respectively.

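T.sum(axis = 0), with R denoting the result, is mathematically equivalent to:

$$R_{jk} = \sum_{i} T_{ijk}$$

Similarly, T.sum(axis = 1):

$$R_{ik} = \sum_{j} T_{ijk}$$

And T.sum(axis = 2):

$$R_{ij} = \sum_{k} T_{ijk}$$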

In other words, the axis you specify is the one that is summed over; for axis = 0, for instance, the first index is summed over. Written as a for loop:

result[j][k] = sum(T[i][j][k] for i in range(T.shape[0])) for all j,k

for axis = 1:

result[i][k] = sum(T[i][j][k] for j in range(T.shape[1])) for all i,k

etc.
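The loop notation above can be turned into something runnable. This is a rough sketch rather than how numpy does it internally; it reuses the array from the question as T, and the names n_i, n_j, n_k, result0 and result1 are only illustrative:

```python
import numpy as np

T = np.arange(0, 30).reshape(2, 3, 5)
n_i, n_j, n_k = T.shape

# axis=0: the first index i is summed away, j and k survive.
result0 = np.zeros((n_j, n_k), dtype=T.dtype)
for j in range(n_j):
    for k in range(n_k):
        result0[j][k] = sum(T[i][j][k] for i in range(n_i))

# axis=1: the second index j is summed away, i and k survive.
result1 = np.zeros((n_i, n_k), dtype=T.dtype)
for i in range(n_i):
    for k in range(n_k):
        result1[i][k] = sum(T[i][j][k] for j in range(n_j))

# Both comparisons should agree with the built-in reduction.
print(np.array_equal(result0, T.sum(axis=0)))
print(np.array_equal(result1, T.sum(axis=1)))
```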

hpaulj · Accepted Answer · 2021-05-22 07:23:40Z

numpy displays a (2,3,5) array as 2 blocks of 3x5 arrays (3 rows, 5 columns). Or call them 'planes' (MATLAB would show it as 5 blocks of 2x3).

The numpy display also matches a nested list - a list of two sublists; each with 3 sublists. Each of those is 5 elements long.

In the 3x5 2d case, axis 0 sums along the size 3 dimension, resulting in a 5 element array. The descriptions 'sum over rows' or 'sum along columns' are a little vague in English. Focus on the results, the change in shape, and which values are being summed, not on the description.

Back to the 3d case:

With axis=0, it sums along the 1st dimension, effectively removing it, leaving us with a 3x5 array. 0+15=15, 1+16=17 etc.

Axis 1 condenses the size 3 dimension; the result is 2x5. 0+5+10=15, etc.

Axis 2 condenses the size 5 dimension; the result is 2x3. sum((0,1,2,3,4)) = 10, etc.

Your example is good, since the 3 dimensions are different, and it is easier to see which one was eliminated during the sum.

With 2d there's some ambiguity; 'sum over rows' - does that mean the rows are eliminated or retained? With 3d there's no ambiguity; with axis=0, you can only remove it, leaving the other 2.
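The three cases can be checked by adding the slices by hand. A small sketch, assuming the arr from the question; axis0, axis1 and axis2 are just illustrative names:

```python
import numpy as np

arr = np.arange(0, 30).reshape(2, 3, 5)

# axis=0: add the two 3x5 blocks element by element (0+15, 1+16, ...).
axis0 = arr[0] + arr[1]
# axis=1: within each block, add the three rows (0+5+10, 1+6+11, ...).
axis1 = arr[:, 0, :] + arr[:, 1, :] + arr[:, 2, :]
# axis=2: within each row, add the five entries (0+1+2+3+4, ...).
axis2 = arr[:, :, 0] + arr[:, :, 1] + arr[:, :, 2] + arr[:, :, 3] + arr[:, :, 4]

# The hand-built sums should match the axis reductions exactly.
assert np.array_equal(axis0, arr.sum(axis=0))
assert np.array_equal(axis1, arr.sum(axis=1))
assert np.array_equal(axis2, arr.sum(axis=2))
print(axis0.shape, axis1.shape, axis2.shape)
```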

Whilst I understand the explanations (which are very good) it seems problematic that frameworks/languages would treat the dimensions differently (2 blocks of 3x5 vs 5 blocks 2x3). I am used to thinking in terms of x,y,z with x left to right screen bottom, y bottom to top screen, and z increasing from the screen towards me. In that visualization I may get results different to my expectations depending upon the package used.
@phil, 3d, even 2d, can be displayed in different ways. 3d graphs often have x to the right, z up, and y diagonal to the left ("out of the page"). Images often have x as width and y as height, increasing downward. Color may be thought of as 3 color planes, or as a property of each pixel. Tables are rows (down) and columns (across). numpy array dimensions are abstract, capable of storing and manipulating values for any of these. Don't forget they can also be 0d, or 6d.

John Zwinck · Accepted Answer · 2017-01-19 03:59:25Z

The axis you specify is the one that is effectively removed. So given a shape of (2,3,5), axis 0 gives (3,5), axis 1 gives (2,5), etc. This extends to any number of dimensions.
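To see that this carries over to more dimensions, here is a quick sketch with a 4-d array. The shape (2, 3, 5, 7) is an arbitrary choice for illustration and does not come from the question:

```python
import numpy as np

a = np.ones((2, 3, 5, 7), dtype=int)  # arbitrary 4-d example shape

for axis in range(a.ndim):
    # Expected shape: the original shape with entry `axis` deleted.
    expected = a.shape[:axis] + a.shape[axis + 1:]
    print(axis, a.sum(axis=axis).shape, expected)
```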

Daniel F · Accepted Answer · 2017-01-19 07:08:55Z

You seem to be confused by the output style of numpy arrays. The "row" of the output is almost always the last index, not the first. Example:

x=np.arange(1,4)
y=np.arange(10,31,10)
z=np.arange(100,301,100)
xy=x[:,None]+y[None,:]

xy
Out[100]: 
array([[11, 21, 31],
       [12, 22, 32],
       [13, 23, 33]])

Notice the tens place increments on the row, not the column, even though y is the second index.

xyz=x[:,None,None]+y[None,:,None]+z[None,None,:]
xyz
Out[102]: 
array([[[111, 211, 311],
        [121, 221, 321],
        [131, 231, 331]],

       [[112, 212, 312],
        [122, 222, 322],
        [132, 232, 332]],

       [[113, 213, 313],
        [123, 223, 323],
        [133, 233, 333]]])

Now the hundred's place increments in the row, even though z is the last index. This can be somewhat counter-intuitive to beginners.

Thus when you do np.sum(x, axis=-1) you will always sum over the "rows" as shown in the np.array([]) format. Looking at arr.sum(axis=2)[0,0], that's 0+1+2+3+4=10.
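A short check of that last point, assuming the arr from the question (the name last is only for illustration):

```python
import numpy as np

arr = np.arange(0, 30).reshape(2, 3, 5)

# axis=-1 counts from the end, so for a 3-d array it means axis 2.
last = arr.sum(axis=-1)
print(np.array_equal(last, arr.sum(axis=arr.ndim - 1)))

# The first printed "row" of arr and its sum, which is last[0, 0].
print(arr[0, 0], arr[0, 0].sum())
```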

Ananth Raghuraman · Accepted Answer · 2017-10-01 07:06:14Z

Think of a multi-dimensional array as a tree. Each dimension is a level in the tree. Each grouping at that level is a node. A sum along a specific axis (say axis=4) means coalescing (overlaying) all nodes at that level into a single node (under their respective parents). Sub-trees rooted at the overlaid nodes at that level are stacked on top of each other. All overlapping nodes' values are added together.
Picture: https://ibb.co/dg3P3w
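The tree picture can be imitated with nested Python lists, one nesting level per dimension. This is only a sketch of the analogy, not how numpy computes the sum; the helper names add_trees and sum_level are made up here:

```python
import numpy as np

def add_trees(a, b):
    """Overlay two sub-trees of identical structure and add matching leaves."""
    if isinstance(a, list):
        return [add_trees(x, y) for x, y in zip(a, b)]
    return a + b

def sum_level(tree, axis):
    """Coalesce all sibling nodes at depth `axis` into a single node."""
    if axis == 0:
        total = tree[0]
        for child in tree[1:]:
            total = add_trees(total, child)
        return total
    # Not at the target level yet: descend into each child separately.
    return [sum_level(child, axis - 1) for child in tree]

arr = np.arange(0, 30).reshape(2, 3, 5)
tree = arr.tolist()  # nested lists mirroring the array's dimensions

for axis in range(arr.ndim):
    # The overlaid tree should match numpy's reduction along the same axis.
    assert sum_level(tree, axis) == arr.sum(axis=axis).tolist()
```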

answered Sep 28, 2017 at 17:47

PattiMichelle Sheaffer · Accepted Answer · 2020-05-22 17:52:29Z

It's maybe a little easier to see with a simpler 3D array. After filling the array with ones, the numbers in the sums come out to be the size of the particular dimension summed over! The other two dimensions in each case are left intact.

arr = np.arange(0,60).reshape(4,3,5)
arr
Out[10]: 
array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]],

       [[15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29]],

       [[30, 31, 32, 33, 34],
        [35, 36, 37, 38, 39],
        [40, 41, 42, 43, 44]],

       [[45, 46, 47, 48, 49],
        [50, 51, 52, 53, 54],
        [55, 56, 57, 58, 59]]])

arr=arr*0+1

arr
Out[12]: 
array([[[1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1]],

       [[1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1]],

       [[1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1]],

       [[1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1]]])

arr0=arr.sum(axis=0,keepdims=True)
arr2=arr.sum(axis=2,keepdims=True)
arr1=arr.sum(axis=1,keepdims=True)

arr0
Out[20]: 
array([[[4, 4, 4, 4, 4],
        [4, 4, 4, 4, 4],
        [4, 4, 4, 4, 4]]])

arr1
Out[21]: 
array([[[3, 3, 3, 3, 3]],

       [[3, 3, 3, 3, 3]],

       [[3, 3, 3, 3, 3]],

       [[3, 3, 3, 3, 3]]])

arr2
Out[22]: 
array([[[5],
        [5],
        [5]],

       [[5],
        [5],
        [5]],

       [[5],
        [5],
        [5]],

       [[5],
        [5],
        [5]]])
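The same experiment can be written more compactly. A sketch assuming np.ones with an integer dtype as a shortcut for the arr*0+1 step; the name ones is only illustrative:

```python
import numpy as np

ones = np.ones((4, 3, 5), dtype=int)  # same shape as above, built directly

for axis, size in enumerate(ones.shape):
    summed = ones.sum(axis=axis, keepdims=True)
    # Every entry of the sum equals the length of the axis that was summed.
    print(axis, summed.shape, np.unique(summed), size)
```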
