Replace null values in pandas data frame column with 2D np.zeros() array

Question

Assume the following data frame:

import pandas as pd
import numpy as np

vals = [1, 2, 3, 4, 5]

df = pd.DataFrame({'val': vals}, dtype=float)
df.loc[[0, 3], 'val'] = np.nan

Gives:

    val
0   NaN
1   2.0
2   3.0
3   NaN
4   5.0

I need to be able to replace NaN values in the val column with a 2D numpy array of zeros. When I do the following:

z = np.zeros((10, 10))

df['val'][df['val'].isnull()] = z

The arrays are converted to scalars of value 0.0.

I really need the array to be maintained: each NaN value (rows 0 and 3 of the original data frame) should be replaced with a 10x10 array of zeros. I've tried converting to object type first:

df = df.astype(object)
df['val'][df['val'].isnull()] = z

With no success. Why does this happen?

So for 0 in val, do you want the 0th array of z, and for 3 in val, you want the 3rd item from z?
– user17242583, Commented Dec 23, 2021 at 1:30

BENY · Accepted Answer · 2021-12-23 01:51:53Z

1

It is caused by the object data type. One way around it is fillna with a dict that maps each null index to z:

nulls = df['val'].isnull()
df.val.fillna(dict(zip(df.index[nulls], [z] * nulls.sum())), inplace=True)
df
                                                 val
0  [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,...
1                                                2.0
2                                                3.0
3  [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,...
4                                                5.0
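The truncated display above does not show whether each cell holds the full (10, 10) array or only a (10,) row, which is the point raised in the comments. A minimal sketch of a shape check follows; `cell_shapes` is a hypothetical helper, not part of pandas, and the column is built by hand here as a stand-in for the result of the fillna call, so the check does not depend on how the cells were filled.

```python
import numpy as np
import pandas as pd


def cell_shapes(series):
    """Return the shape of every ndarray cell, or None for non-array cells.

    Hypothetical helper for illustration only.
    """
    return [v.shape if isinstance(v, np.ndarray) else None for v in series]


# Stand-in object column: two array cells and one plain float.
cells = np.empty(3, dtype=object)
cells[0] = np.zeros((10, 10))
cells[1] = 2.0
cells[2] = np.zeros((10, 10))
col = pd.Series(cells)

shapes = cell_shapes(col)
print(shapes)  # [(10, 10), None, (10, 10)]
```

A shape of (10,) in that list would mean only one row of z ended up in the cell.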

edited Dec 23, 2021 at 1:51

answered Dec 23, 2021 at 1:31

BENY


5 Comments

JmeCS Over a year ago

Almost, but this replaces values with an array of shape (10,). I really need it to be replaced with z, which is shape (10,10) - a 2d array. Try df['val'][0].shape

BENY Over a year ago

@JmeCS check the update

JmeCS Over a year ago

Many thanks! This is gnarly. Really unclear to me why it drops the array structure to a scalar in the first place...

user17242583 Over a year ago

@JmeCS check my answer. It's simpler, and it might work for you.

JmeCS Over a year ago

@richardec I upvoted it but it doesn't work for my real-world problem unfortunately. Not clear why. It's too complicated to replicate here but your solution definitely works for the dummy problem.

user17242583 · Answer · 2021-12-23 01:55:15Z

1

You were really close. Change the dataframe's dtype to object, index with df.loc instead of chained indexing, and change = z to = [z]:

df = df.astype(object)
df.loc[df['val'].isna(), 'val'] = [z]

Output:

>>> df
                                                 val
0  [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,...
1                                                2.0
2                                                3.0
3  [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,...
4                                                5.0
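If the one-line assignment does not behave this way on other data (the comments report that it failed on the asker's real problem), a more explicit alternative is to build the column by hand as a NumPy object array and assign it back. This is only a sketch under the question's setup; the name `filled` is made up for illustration, and copying z per cell is an assumption about what is wanted (drop `.copy()` if sharing one array is fine).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'val': [np.nan, 2.0, 3.0, np.nan, 5.0]})
z = np.zeros((10, 10))

# Allocate a 1D object array first, then fill it cell by cell, so NumPy
# never tries to broadcast the 2D array across several cells.
filled = np.empty(len(df), dtype=object)
for i, v in enumerate(df['val']):
    # Copy z so the replaced cells do not share one underlying buffer.
    filled[i] = z.copy() if pd.isna(v) else v

# Assigning a 1D object array keeps every cell as-is (dtype becomes object).
df['val'] = filled
```

After this, `df['val'].iloc[0].shape` is `(10, 10)` and the non-null rows keep their original float values.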

edited Dec 23, 2021 at 1:55

answered Dec 23, 2021 at 1:34

user17242583
