calculations for different columns in a numpy array

Question

I have a 2D array filled with some values (column 0) and zeros (the rest of the columns). I would like to do pretty much what I do in MS Excel, but using numpy: fill the remaining columns with values calculated from the first column. Here is an MWE:

import numpy as np

a = np.zeros(20, dtype=np.int8).reshape(4,5)

b = [1, 2, 3, 4]

b = np.array(b)

a[:, 0] = b

# don't change the first column
for column in a[:, 1:]:
    a[:, column] = column[0]+1

The expected output:

array([[1, 2, 3, 4, 5],
       [2, 3, 4, 5, 6],
       [3, 4, 5, 6, 7],
       [4, 5, 6, 7, 8]], dtype=int8)

The resulting output:

array([[1, 0, 0, 0, 0],
       [1, 0, 0, 0, 0],
       [1, 0, 0, 0, 0],
       [1, 0, 0, 0, 0]], dtype=int8)

Any help would be appreciated.

Print the value of `column` in the iteration. Is it the column value or index?
– hpaulj, Commented Aug 27, 2016 at 19:47

John1024 · Accepted Answer · 2016-08-27 20:40:44Z

Looping is slow and there is no need to loop to produce the array that you want:

>>> a = np.ones(20, dtype=np.int8).reshape(4,5)
>>> a[:, 0] = b
>>> a
array([[1, 1, 1, 1, 1],
       [2, 1, 1, 1, 1],
       [3, 1, 1, 1, 1],
       [4, 1, 1, 1, 1]], dtype=int8)
>>> np.cumsum(a, axis=1)
array([[1, 2, 3, 4, 5],
       [2, 3, 4, 5, 6],
       [3, 4, 5, 6, 7],
       [4, 5, 6, 7, 8]])
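
As a side note that is not part of the original answer, the same table can probably also be built with broadcasting, without filling an array of ones first. This is only a minimal sketch; the names offsets and result are chosen here for illustration.

```python
import numpy as np

b = np.array([1, 2, 3, 4])

# Turn b into a (4, 1) column and add a (5,) row of offsets;
# broadcasting expands both operands to shape (4, 5).
offsets = np.arange(5)
result = b[:, np.newaxis] + offsets
print(result)
```

Each row starts at the corresponding value of b and increases by one per column, which matches the expected output in the question.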

What went wrong

Let's start, as in the question, with this array:

>>> a
array([[1, 0, 0, 0, 0],
       [2, 0, 0, 0, 0],
       [3, 0, 0, 0, 0],
       [4, 0, 0, 0, 0]], dtype=int8)

Now, using the code from the question, let's do the loop and see what column actually is:

>>> for column in a[:, 1:]:
...   print(column)
... 
[0 0 0 0]
[0 0 0 0]
[0 0 0 0]
[0 0 0 0]

As you can see, column is not the index of a column but an array of values. (Strictly speaking, iterating over a 2D array yields its rows, so each item is one row of a[:, 1:].) Consequently, the following does not do what you would hope:

a[:, column] = column[0]+1
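
To make the effect of that assignment concrete, here is a small sketch that runs a single iteration by hand. The variable name first is introduced here for illustration only.

```python
import numpy as np

a = np.zeros((4, 5), dtype=np.int8)
a[:, 0] = [1, 2, 3, 4]

# Iterating over a 2D array yields rows, so this is the first row of the slice.
first = next(iter(a[:, 1:]))
print(first)  # [0 0 0 0]

# Used as an index, the array [0 0 0 0] selects column 0 four times,
# so the assignment overwrites column 0 with first[0] + 1, which is 1.
a[:, first] = first[0] + 1
print(a[:, 0])  # [1 1 1 1]
```

This is why the output in the question has 1 in every row of the first column while the other columns stay zero.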

Another method

If we want to loop (so that we can do something more complex), here is another approach to generating the desired array:

>>> b = np.array([1, 2, 3, 4])
>>> np.column_stack([b+i for i in range(5)])
array([[1, 2, 3, 4, 5],
       [2, 3, 4, 5, 6],
       [3, 4, 5, 6, 7],
       [4, 5, 6, 7, 8]])
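
One way the list comprehension above might be extended to more elaborate, spreadsheet-like calculations is to keep one function per column. This is a hedged sketch: the formulas list and its contents are made up for illustration and do not come from the question.

```python
import numpy as np

b = np.array([1, 2, 3, 4])

# Hypothetical per-column formulas, each taking the first column as input.
formulas = [
    lambda x: x,           # column 0: the original values
    lambda x: x ** 2,      # column 1: squares
    lambda x: 10 * x + 1,  # column 2: a linear formula
]

# Evaluate every formula on b and stack the results as columns.
table = np.column_stack([f(b) for f in formulas])
print(table)
```

Each formula operates on the whole column at once, so the only Python-level loop is over columns, not over individual cells.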

Good start, but the np.cumsum approach doesn't work if a is all zeros.
@Kasramvd Good comment, but notice that that is why I created a with ones in it when using cumsum.
@John1024 Thanks for the explanation. But what if I want to do more elaborate calculations than just adding 1?
Maybe that's not what the OP wants. Besides, it's interesting to solve that too :).
@david_doji I just added another method to the answer that could be extended to do something more elaborate than summing ones.

akuiper · Accepted Answer · 2016-08-27 19:45:51Z

Your usage of column is a little ambiguous: in for column in a[:, 1:] it is treated as a column, but in the loop body it is treated as an index into the columns. You can try this instead:

for column in range(1, a.shape[1]):
    a[:, column] = a[:, column-1]+1

a
#array([[1, 2, 3, 4, 5],
#       [2, 3, 4, 5, 6],
#       [3, 4, 5, 6, 7],
#       [4, 5, 6, 7, 8]], dtype=int8)
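
The same index-based loop should carry over to other rules in which each column depends on the previous one. The sketch below doubles the previous column, a rule chosen purely for illustration, and uses int64 instead of int8 on the assumption that the values may grow past the int8 range of -128 to 127.

```python
import numpy as np

# A wider dtype than int8 leaves room for values that grow quickly.
a = np.zeros((4, 5), dtype=np.int64)
a[:, 0] = [1, 2, 3, 4]

# Loop over column indices; each column is computed from the one before it.
for j in range(1, a.shape[1]):
    a[:, j] = 2 * a[:, j - 1]
print(a)
```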
