Assume the following data frame:

import pandas as pd
import numpy as np

vals = [1, 2, 3, 4, 5]

df = pd.DataFrame({'val': vals})
df['val'][[0, 3]] = np.nan

Gives:

    val
0   NaN
1   2.0
2   3.0
3   NaN
4   5.0

I need to be able to replace NaN values in the val column with a 2D numpy array of zeros. When I do the following:

z = np.zeros((10, 10))

df['val'][df['val'].isnull()] = z

The arrays are converted to scalars of value 0.0:

    val
0   0.0
1   2.0
2   3.0
3   0.0
4   5.0

I really need the array to be maintained: each NaN value (rows 0 and 3 of the original data frame) should be replaced with a 10x10 array of zeros. I've tried converting to object type first:

df = df.astype(object)
df['val'][df['val'].isnull()] = z

With no success. Why?

  • Will you please add a sample of your expected output? Commented Dec 23, 2021 at 1:12
  • It's pretty clear from the example, right? Commented Dec 23, 2021 at 1:25
  • So for 0 in val, do you want the 0th array of z, and for 3 in val, you want the 3rd item from z? Commented Dec 23, 2021 at 1:30
  • See my very simple answer below... Commented Dec 23, 2021 at 1:55

2 Answers

This is caused by the object data type. One way around it is fillna with a dict that maps each NaN index to the array:

df.val.fillna(dict(zip(df.index[df['val'].isnull()],[z]*df['val'].isnull().sum())),inplace=True)
df
                                                 val
0  [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,...
1                                                2.0
2                                                3.0
3  [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,...
4                                                5.0

5 Comments

Almost, but this replaces values with an array of shape (10,). I really need it to be replaced with z, which is shape (10,10) - a 2d array. Try df['val'][0].shape
@JmeCS check the update
Many thanks! This is gnarly. Really unclear to me why it drops the array structure to a scalar in the first place...
@JmeCS check my answer. It's simpler, and it might work for you.
@richardec I upvoted it but it doesn't work for my real-world problem unfortunately. Not clear why. It's too complicated to replicate here but your solution definitely works for the dummy problem.

You were really close. Change the dataframe's dtype to object and change = z to = [z]:

df = df.astype(object)
df.loc[df['val'].isna(), 'val'] = [z]

Output:

>>> df
                                                 val
0  [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,...
1                                                2.0
2                                                3.0
3  [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,...
4                                                5.0
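If the mask-based assignment turns out to be fragile on a larger data set (as one comment above reports), a more explicit fallback is to build the object column element by element, so pandas never gets a chance to broadcast or unpack the array. This is only a sketch under the question's setup; the name filled is introduced here for illustration, and each NaN gets its own copy of z so the cells do not share memory.

```python
import numpy as np
import pandas as pd

# Same data as in the question: NaN in rows 0 and 3.
df = pd.DataFrame({'val': [np.nan, 2.0, 3.0, np.nan, 5.0]})
z = np.zeros((10, 10))

# Pre-allocate a 1-D object array; assigning by integer index stores
# the 2-D array itself in the slot instead of broadcasting it.
filled = np.empty(len(df), dtype=object)
for i, v in enumerate(df['val']):
    filled[i] = z.copy() if pd.isna(v) else v

# Replace the column with the object-dtype result.
df['val'] = pd.Series(filled, index=df.index)

print(df['val'].iloc[0].shape)  # (10, 10)
```

The loop is slower than a vectorised assignment, but a column that holds arrays is object dtype anyway, so little vectorisation is lost.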
