Where clause with numpy with single array and / or empty_like

Question

I am trying to figure out how the np.where clause works. I create a simple df:

import numpy as np
import pandas as pd

np.random.seed(1)
df = pd.DataFrame(np.random.randint(0, 10, size=(3, 4)), columns=list('ABCD'))
print(df)

   A  B  C  D
0  5  8  9  5
1  0  0  1  7
2  6  9  2  4

Now when I implement:

print(np.where(df.values, 1, np.nan))

I receive:

[[  1.   1.   1.   1.]
 [ nan  nan   1.   1.]
 [  1.   1.   1.   1.]]

But when I create an empty_like array from df.values and pass it to np.where, I get this:

print(np.where(np.empty_like(df.values), 1, np.nan))

[[ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]]

I could really use help understanding how np.where works when a single array is given as the condition.

Why are you using np.empty_like? Note that its values will not be 0, and thus none of them will be falsy, which is why np.where returns an ndarray of ones
– yatu, Commented Mar 13, 2019 at 12:50
Hello @yatu, empty_like actually produces zeros for me, so the result is all nan, unlike the OP's. I can't reproduce the problem.
– Mohit Motwani, Commented Mar 13, 2019 at 12:58
empty_like creates an array of arbitrary data. So yes, some of the time none of it is 0.
– ALollz, Commented Mar 13, 2019 at 13:03
I found np.empty_like in an example on the Internet for my problem, but then realized that it works fine without it. So how should I read the first form, np.where(array, 1, np.nan)?
– cincin21, Commented Mar 13, 2019 at 13:04

Justice_Lords · Accepted Answer · 2019-03-13 13:48:52Z

np.empty_like()

Docs:-

numpy.empty_like(prototype, dtype=None, order='K', subok=True)

Return a new array with the same shape and type as a given array.

>>> a = ([1,2,3], [4,5,6])                         # a is array-like
>>> np.empty_like(a)
array([[-1073741821, -1073741821,           3],    # uninitialized
       [          0,           0, -1073741821]])

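np.empty_like() creates an array of the same shape and type as the given array, but its contents are uninitialized: they are whatever happened to be in that memory, so they are arbitrary (often non-zero) rather than truly random. This array is then passed to np.where().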
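To make the difference concrete, here is a minimal sketch that compares np.empty_like with np.zeros_like. The array name `values` is only illustrative (it copies the numbers from the question), and the contents of the empty array will differ between runs and machines, so they are not printed.

```python
import numpy as np

# Same numbers as df.values in the question (hard-coded for illustration).
values = np.array([[5, 8, 9, 5],
                   [0, 0, 1, 7],
                   [6, 9, 2, 4]])

# empty_like only guarantees shape and dtype; the contents are whatever
# happened to be in memory, so they must not be relied upon.
scratch = np.empty_like(values)
print(scratch.shape, scratch.dtype)

# zeros_like has well-defined contents: every element is 0, hence falsy.
zeros = np.zeros_like(values)
zeros_result = np.where(zeros, 1, np.nan)
print(zeros_result)  # every entry is nan
```

If a predictable starting array is needed, np.zeros_like or np.full_like is usually the safer choice; np.empty_like is meant for arrays that will be completely overwritten before being read.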

numpy.where()

Docs:-

numpy.where(condition[, x, y])

Return elements that are chosen from x or y depending on condition.

Example:-

>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.where(a < 5, a, 10*a)
array([ 0,  1,  2,  3,  4, 50, 60, 70, 80, 90])
>>> np.where(a, 1, np.nan)
array([nan,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])

In Python, any number other than zero is considered to be True, whereas zero is considered to be False.
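A short sketch of this truthiness rule, first for plain Python numbers and then for a NumPy array converted to booleans (the sample numbers are arbitrary):

```python
import numpy as np

# Plain Python: zero is falsy, every other number is truthy.
print(bool(0), bool(7), bool(-3), bool(0.0))  # False True True False

# NumPy applies the same rule element by element.
a = np.arange(5)            # array([0, 1, 2, 3, 4])
mask = a.astype(bool)
print(mask)                 # [False  True  True  True  True]
```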

When np.where() receives an array as its first argument, the array itself acts as the condition: each element is treated as True if it is non-zero and as False if it is 0. The result holds 1 at the True positions and np.nan at the False positions.
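Applied to the array from the question, this can be sketched as follows. The explicit comparison `values != 0` is written out only to make the implicit condition visible; it is an illustration, not something np.where requires.

```python
import numpy as np

# Same numbers as df.values in the question (hard-coded for illustration).
values = np.array([[5, 8, 9, 5],
                   [0, 0, 1, 7],
                   [6, 9, 2, 4]])

implicit = np.where(values, 1, np.nan)       # array used directly as condition
explicit = np.where(values != 0, 1, np.nan)  # same condition spelled out

# nan != nan, so compare the nan positions and the remaining values separately.
same_nan_positions = np.array_equal(np.isnan(implicit), np.isnan(explicit))
same_other_values = np.array_equal(implicit[~np.isnan(implicit)],
                                   explicit[~np.isnan(explicit)])
print(same_nan_positions and same_other_values)  # True
```

Only the two zeros in the second row become nan, which matches the first output shown in the question.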

"... any number other than zero is considered to be TRUE whereas zero is considered to be FALSE." That is what I was missing, great!
