numpy.array's have bizarre behavior with /= operator?

Question

I'm trying to normalize an array of numbers to the range (0, 1] so that I can use them as weights for a weighted random.choice(), so I entered this line:

# weights is a nonzero numpy.array
weights /= weights.max()

However, PyCharm said there's an unfilled parameter to the max() function (Parameter 'initial' unfilled). I tried this in the REPL with the /= operator and with "regular" division (a = a / b). The two gave different results, and the error was not the one PyCharm reported:

>>> import numpy
>>> a = numpy.array([1,2,3])
>>> a.max()
3
>>> a /= a.max()
Traceback (most recent call last):
  File "<pyshell#4>", line 1, in <module>
    a /= a.max()
TypeError: No loop matching the specified signature and casting was found for ufunc true_divide
>>> a = a/a.max()
>>> a
array([0.33333333, 0.66666667, 1.        ])

I also realized that for a weighted random choice, the weights need to sum to one rather than have a maximum of one. But dividing by the sum yielded the exact same TypeError with the /= operator (though PyCharm thought this was okay):

>>> a = numpy.array([1,2,3])
>>> sum(a)
6
>>> a
array([1, 2, 3])
>>> a /= sum(a)
Traceback (most recent call last):
  File "<pyshell#13>", line 1, in <module>
    a /= sum(a)
TypeError: No loop matching the specified signature and casting was found for ufunc true_divide
>>> a = a / sum(a)
>>> a
array([0.16666667, 0.33333333, 0.5       ])

What have I come across here? Is this some bizarre bug in Numpy or does the /= operator have a different use or something? I know they use __truediv__ and __itruediv__ but I can't see why one has a problem and the other doesn't. I have confirmed this behavior with the latest version of Numpy from pip (1.19.2 on Windows x64).

What's the a.dtype? What do you expect after the inplace division?
– hpaulj, Commented Sep 11, 2020 at 23:04
Try converting a to np.float32. In-place division is not defined on integers due to the possibility of requiring type conversion to a floating-point representation.
– Mateen Ulhaq, Commented Sep 11, 2020 at 23:06
Either float64 or int32 (which will get converted to float64 as a result of the operation I'm trying to do). Both array types exhibit this problem.
– ntoskrnl4, Commented Sep 11, 2020 at 23:09
@G.Anderson AHA! Indeed using floats as the initialization values causes it to work properly. That's really interesting. It kind of suggests to me that what I found could be a bug, but it could also be intended behavior as suggested by @Mateen Ulhaq.
– ntoskrnl4, Commented Sep 11, 2020 at 23:20
accidentally deleted my previous comment. np.array([1.0,2.0,3.0]) works but np.array([1,2,3]) throws the error
– G. Anderson, Commented Sep 11, 2020 at 23:22

Fahim · Accepted Answer · 2020-09-11 23:24:18Z


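It's because NumPy cares about the dtype. True division of an integer array produces floats, but an in-place operation has to write the result back into the existing integer array, and NumPy won't silently cast floats into an integer dtype. That's why your values should already be floats. Try this: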

>>> import numpy as np
>>> a = np.array([1.0,2.0,3.0])
>>> a /= sum(a)
>>> a
array([0.16666667, 0.33333333, 0.5       ])

But why did the other one work? Because a = a / sum(a) is not an "in-place" operation. The division creates a new array with a float dtype, and the name a is then rebound to that new array. The original integer array is never written to, so NumPy has nothing to complain about.
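To make the difference concrete, here is a minimal sketch (the names b and w are only illustrative, not from the question). It assumes a reasonably recent NumPy, where the in-place failure is still a TypeError or a subclass of it, though the message text may differ from the one in the question.

```python
import numpy as np

a = np.array([1, 2, 3])          # integer dtype (int32 or int64, platform dependent)

# In-place true division must store float results in the integer array,
# which NumPy refuses to do.
failed = False
try:
    a /= a.max()
except TypeError:
    failed = True
print("in-place division failed:", failed)

# Out-of-place division builds a brand new float64 array; `a` is untouched.
b = a / a.max()
print(b.dtype, b is a)

# One possible fix: convert to float first, then the in-place form works.
w = a.astype(np.float64)
w /= w.sum()                     # weights now sum to 1
print(w)
```

Converting once with astype and then dividing in place keeps the /= form while making the float dtype explicit.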


6 Comments

ntoskrnl4 Over a year ago

So it's still internally an int type even though a.dtype returns float64?

Fahim Over a year ago

When you assign the same name to 2 different memory locations, the newer one gets the name and the old one no longer has a reference. The old one then becomes eligible for garbage collection. Read my post "Object References and Garbage Collection". Try to read the entire post to understand it properly.

ntoskrnl4 Over a year ago

I'm well familiar with Python's garbage collection and how variables work. I'm just slightly confused that numpy has this error with integer types, despite it claiming that the array's datatype is a float.

Fahim Over a year ago

As I told you, it's because the type is changing. Try a *= sum(a) and it won't raise any error, because there the type is not changing.

Mad Physicist Over a year ago

@ntoskrnl4. If the string representation of an array has no decimal points after the numbers, the dtype is an integer type. numpy.array([1,2,3]) creates an array with dtype np.int_. Not sure why you think a is a float...
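The points made in these comments can be checked directly. The sketch below is an illustration under one assumption: that in-place ufunc operations write their result back under NumPy's default 'same_kind' casting rule. It uses np.can_cast to ask whether each result dtype could be stored in the original array; div_dtype and mul_dtype are illustrative names.

```python
import numpy as np

a = np.array([1, 2, 3])
print(a.dtype)                        # an integer dtype, not float64

# True division of integers yields float64 ...
div_dtype = (a / a.sum()).dtype
# ... which cannot be stored in an integer array under 'same_kind' casting.
print(np.can_cast(div_dtype, a.dtype, casting="same_kind"))

# Integer multiplication stays integer, so writing it back is allowed.
mul_dtype = (a * a.sum()).dtype
print(np.can_cast(mul_dtype, a.dtype, casting="same_kind"))

a *= a.sum()                          # works: no change of kind needed
print(a.tolist())                     # [6, 12, 18]
```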
