Numpy modify array in place?

Question

I have the following code, which attempts to normalize the values of an m x n array (it will be used as input to a neural network, where m is the number of training examples and n is the number of features).

However, when I inspect the array in the interpreter after the script runs, I see that the values are not normalized; that is, they still have the original values. I guess this is because the assignment to the array variable inside the function is only seen within the function.

How can I do this normalization in place? Or do I have to return a new array from the normalize function?

import numpy

def normalize(array, imin = -1, imax = 1):
    """I = Imin + (Imax-Imin)*(D-Dmin)/(Dmax-Dmin)"""

    dmin = array.min()
    dmax = array.max()

    array = imin + (imax - imin)*(array - dmin)/(dmax - dmin)
    print(array[0])


def main():

    array = numpy.loadtxt('test.csv', delimiter=',', skiprows=1)
    for column in array.T:
        normalize(column)

    return array

if __name__ == "__main__":
    a = main()

senderle · Accepted Answer · 2012-04-14 00:05:23Z

If you want to apply mathematical operations to a numpy array in-place, you can simply use the standard in-place operators +=, -=, /=, etc. So for example:

>>> def foo(a):
...     a += 10
... 
>>> a = numpy.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> foo(a)
>>> a
array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])

The in-place version of these operations is a tad faster to boot, especially for larger arrays:

>>> def normalize_inplace(array, imin=-1, imax=1):
...     dmin = array.min()
...     dmax = array.max()
...     array -= dmin
...     array *= imax - imin
...     array /= dmax - dmin
...     array += imin
... 
>>> def normalize_copy(array, imin=-1, imax=1):
...     dmin = array.min()
...     dmax = array.max()
...     return imin + (imax - imin) * (array - dmin) / (dmax - dmin)
... 
>>> a = numpy.arange(10000, dtype='f')
>>> %timeit normalize_inplace(a)
10000 loops, best of 3: 144 us per loop
>>> %timeit normalize_copy(a)
10000 loops, best of 3: 146 us per loop
>>> a = numpy.arange(1000000, dtype='f')
>>> %timeit normalize_inplace(a)
100 loops, best of 3: 12.8 ms per loop
>>> %timeit normalize_copy(a)
100 loops, best of 3: 16.4 ms per loop
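
One caveat with the in-place operators is that the result has to fit the array's existing dtype. The sketch below assumes a reasonably recent NumPy, where true division into an integer array raises an error instead of silently truncating. The variable names `ints`, `floats` and `int_ok` are only illustrative.

```python
import numpy as np

def normalize_inplace(array, imin=-1, imax=1):
    """Rescale `array` to [imin, imax] in place (expects a float array)."""
    dmin = array.min()
    dmax = array.max()
    array -= dmin
    array *= imax - imin
    array /= dmax - dmin
    array += imin

ints = np.arange(5)            # integer dtype
try:
    normalize_inplace(ints)    # `/=` cannot store float results in an int array
    int_ok = True
except TypeError:
    int_ok = False

floats = np.arange(5, dtype=float)
normalize_inplace(floats)      # works: the array already holds floats
print(int_ok, floats)
```

If the data may arrive as integers, converting once with `array.astype(float)` before normalizing is probably the simplest workaround, at the cost of one copy.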

The version I use here is only built in to ipython. But it's based on the timeit function in the timeit module.
Ah finally looked at ipython. Funny I had always associated it with ironpython, mistakenly I now see.
@User, yeah it's quite useful at times. I usually just use the regular python shell, but for timings, the %timeit "magic command" is incredibly handy, because it takes care of all the awkward setup for you.
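
Outside IPython, a similar comparison can be made with the standard library's timeit module directly. This is only a sketch: the array size and the `number`/`repeat` values are arbitrary choices, and the timings will not match the figures quoted in the answer.

```python
import timeit
import numpy as np

def normalize_inplace(array, imin=-1, imax=1):
    dmin = array.min()
    dmax = array.max()
    array -= dmin
    array *= imax - imin
    array /= dmax - dmin
    array += imin

def normalize_copy(array, imin=-1, imax=1):
    dmin = array.min()
    dmax = array.max()
    return imin + (imax - imin) * (array - dmin) / (dmax - dmin)

a = np.arange(100000, dtype='f')

# Passing a callable avoids the string-based setup that timeit otherwise needs.
# repeat() returns one total time per run; the minimum is the usual figure to report.
t_inplace = min(timeit.repeat(lambda: normalize_inplace(a), number=20, repeat=3))
t_copy = min(timeit.repeat(lambda: normalize_copy(a), number=20, repeat=3))

print("in-place: %.3f ms per call" % (t_inplace / 20 * 1e3))
print("copy:     %.3f ms per call" % (t_copy / 20 * 1e3))
```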

Ian Hincks · Accepted Answer · 2019-03-08 21:13:32Z

This is a trick that is slightly more general than the other useful answers here:

def normalize(array, imin = -1, imax = 1):
    """I = Imin + (Imax-Imin)*(D-Dmin)/(Dmax-Dmin)"""

    dmin = array.min()
    dmax = array.max()

    array[...] = imin + (imax - imin)*(array - dmin)/(dmax - dmin)

Here we are assigning values to the view array[...] rather than assigning these values to some new local variable within the scope of the function.

import numpy as np

x = np.arange(5, dtype='float')
print(x)
normalize(x)
print(x)

>>> [0. 1. 2. 3. 4.]
>>> [-1.  -0.5  0.   0.5  1. ]
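
To make the difference between rebinding a name and writing into the existing array concrete, here is a minimal sketch. The function names `rebind` and `assign_into` are made up for illustration.

```python
import numpy as np

def rebind(array):
    # Creates a new array and binds the local name to it;
    # the caller's array is untouched.
    array = array * 2

def assign_into(array):
    # Writes the new values into the existing buffer that the caller also holds.
    array[...] = array * 2

x = np.arange(5, dtype=float)

rebind(x)
after_rebind = x.copy()      # unchanged by rebind()

assign_into(x)
after_assign = x.copy()      # doubled by assign_into()

print(after_rebind)
print(after_assign)
```

The right-hand side `array * 2` still allocates a temporary in both functions; only the destination of the result differs.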

EDIT:

It's slower; it allocates a new array. But it may be valuable if you are doing something more complicated where builtin in-place operations are cumbersome or don't suffice.

def normalize2(array, imin=-1, imax=1):
    dmin = array.min()
    dmax = array.max()

    array -= dmin
    array *= (imax - imin)
    array /= (dmax-dmin)
    array += imin

A = np.random.randn(200**3).reshape([200] * 3)
%timeit -n5 -r5 normalize(A)
%timeit -n5 -r5 normalize2(A)

>> 47.6 ms ± 678 µs per loop (mean ± std. dev. of 5 runs, 5 loops each)
>> 26.1 ms ± 866 µs per loop (mean ± std. dev. of 5 runs, 5 loops each)
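
For operations that have no operator form, many NumPy functions accept an `out` argument, which is another way to avoid the extra allocation. The sketch below is only illustrative: the function name and the choice of `np.clip` and `np.sqrt` are assumptions made for the example, not part of the normalization in the question.

```python
import numpy as np

def clip_and_sqrt_inplace(array, lo=0.0, hi=1.0):
    """Clip to [lo, hi], then take the square root, writing back into `array`."""
    np.clip(array, lo, hi, out=array)
    np.sqrt(array, out=array)

x = np.array([-1.0, 0.25, 4.0])
view = x[:]                  # a second reference to the same buffer
clip_and_sqrt_inplace(x)

print(x)
print(np.shares_memory(view, x))
```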

ely · Accepted Answer · 2012-04-13 23:23:46Z


def normalize(array, imin = -1, imax = 1):
    """I = Imin + (Imax-Imin)*(D-Dmin)/(Dmax-Dmin)"""

    dmin = array.min()
    dmax = array.max()


    array -= dmin
    array *= (imax - imin)
    array /= (dmax - dmin)
    array += imin

    print(array[0])
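
Since the question loops over array.T, it may help to see that this pattern works with an in-place normalize, because each column produced by the loop is a view into the original array. A small sketch, using a hand-written array in place of test.csv:

```python
import numpy as np

def normalize(array, imin=-1, imax=1):
    """I = Imin + (Imax-Imin)*(D-Dmin)/(Dmax-Dmin), applied in place."""
    dmin = array.min()
    dmax = array.max()
    array -= dmin
    array *= (imax - imin)
    array /= (dmax - dmin)
    array += imin

data = np.array([[0.0, 10.0],
                 [5.0, 20.0],
                 [10.0, 40.0]])

for column in data.T:        # each column is a view, not a copy
    normalize(column)        # so the in-place updates land in `data`

print(data)
```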

Performance-wise, is there any issue with doing it this way? How does it compare to creating a new array?
I mean, for that you'd have to benchmark. It depends on the size of the array. For small-ish problems, I would certainly just create the new array.

salomonvh · Accepted Answer · 2017-08-31 07:07:23Z

There is a nice way to do normalization when using numpy: np.vectorize is very useful when combined with a lambda function and applied to an array. Note that it returns a new array rather than modifying the input in place. See the example below:

import numpy as np

def normalizeMe(value,vmin,vmax):

    vnorm = float(value-vmin)/float(vmax-vmin)

    return vnorm

imin = 0
imax = 10
feature = np.random.randint(10, size=10)

# Vectorize your function (only need to do it once)
temp = np.vectorize(lambda val: normalizeMe(val,imin,imax)) 
normfeature = temp(np.asarray(feature))

print(feature)
print(normfeature)

One can compare the performance with a generator expression, although there are likely many other ways to do this.

%%timeit
temp = np.vectorize(lambda val: normalizeMe(val,imin,imax)) 
normfeature1 = temp(np.asarray(feature))
10000 loops, best of 3: 25.1 µs per loop


%%timeit
normfeature2 = [i for i in (normalizeMe(val,imin,imax) for val in feature)]
100000 loops, best of 3: 9.69 µs per loop

%%timeit
normalize(np.asarray(feature))
100000 loops, best of 3: 12.7 µs per loop

So vectorize is definitely not the fastest, but it can be convenient in cases where performance is not as important.
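
For comparison, the same formula can also be written as plain array arithmetic, which lets NumPy do the looping internally. This is a sketch under the same `imin`, `imax` and `feature` setup as above. The in-place variant assumes a float copy of the data, since the results are not integers.

```python
import numpy as np

imin, imax = 0, 10
feature = np.random.randint(10, size=10)

# np.vectorize version, as in the answer (a Python-level loop under the hood)
vectorized = np.vectorize(lambda v: float(v - imin) / float(imax - imin))
norm_loop = vectorized(feature)

# Plain array arithmetic: the same formula applied to the whole array at once
norm_array = (feature - imin) / float(imax - imin)

# In-place variant: needs a float array to hold the fractional results
feature_f = feature.astype(float)
feature_f -= imin
feature_f /= imax - imin

print(norm_loop)
print(norm_array)
print(feature_f)
```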

It does the job, but it is very slow since it is implemented like a for-loop, according to the documentation.
Are there any benchmarks for this kind of thing? You'd hope that vectorize might help it go much faster.
