I'm new to pandas and am trying to replace the NaN values in a column of df1 with the values from df2 wherever unique_col matches, but I'm getting the error shown further down.

df1
unique_col  |  Measure
944537          NaN
7811403         NaN 
8901242114307     1 

df2
unique_col  |  Measure
944537           18
7811403          12 
8901242114307    17.5



df1.loc[(df1.unique_col.isin(df2.unique_col) &
                       df1.Measure.isnull()), ['Measure']] = df2[['Measure']]

I have two dataframes with 3 million records, and performing the operation above raises the following error:

ValueError: cannot reindex from a duplicate axis

1 Answer

An easy way to fill NaNs is to use the fillna function. In your case, if your dataframes look like this (notice the indexes)

    unique_col      Measure
0   944537          NaN
1   7811403         NaN
2   8901242114307   1.0


    unique_col      Measure
0   944537          18.0
1   7811403         12.0
2   8901242114307   17.5

You can simply call

>>> df1.fillna(df2)


    unique_col       Measure
0   944537           18.0
1   7811403          12.0
2   8901242114307    1.0
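
For reference, the example above can be reproduced end to end. This is a minimal sketch that rebuilds the two frames from the question with a default 0..2 index; the name `filled` is only illustrative.

```python
import numpy as np
import pandas as pd

# The two frames from the question, both with the default 0..2 index.
df1 = pd.DataFrame({
    "unique_col": [944537, 7811403, 8901242114307],
    "Measure": [np.nan, np.nan, 1.0],
})
df2 = pd.DataFrame({
    "unique_col": [944537, 7811403, 8901242114307],
    "Measure": [18.0, 12.0, 17.5],
})

# fillna with a DataFrame aligns on index and column labels, so only the
# missing cells in df1 are taken from df2; existing values are kept.
filled = df1.fillna(df2)
print(filled)  # Measure is [18.0, 12.0, 1.0]
```

Note that fillna returns a new frame, so assign the result back if you want to keep it.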

If the indexes do not line up as they do above, set unique_col as the index on both dataframes and use the same function

df1 = df1.set_index('unique_col')
df1.fillna(df2.set_index('unique_col'))
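
The "duplicate axis" error in the question suggests that unique_col may contain repeated values, in which case aligning on it as an index may be ambiguous or fail. One possible workaround, sketched below, is to build a lookup with one row per key and map it onto df1. The data here is made up to show repeated keys, the name `lookup` is hypothetical, and keeping the first df2 row per key is an assumption about which duplicate should win.

```python
import numpy as np
import pandas as pd

# Made-up data: rows in a different order, with a repeated key in each frame.
df1 = pd.DataFrame({
    "unique_col": [944537, 7811403, 944537, 8901242114307],
    "Measure": [np.nan, np.nan, 5.0, 1.0],
})
df2 = pd.DataFrame({
    "unique_col": [8901242114307, 944537, 7811403, 7811403],
    "Measure": [17.5, 18.0, 12.0, 99.0],
})

# Keep the first df2 row per key so the lookup index is unique
# (assumption: the first occurrence is the one you want).
lookup = df2.drop_duplicates("unique_col").set_index("unique_col")["Measure"]

# Map each df1 key to its df2 value, then fill only the missing entries.
df1["Measure"] = df1["Measure"].fillna(df1["unique_col"].map(lookup))
print(df1)  # Measure is [18.0, 12.0, 5.0, 1.0]
```

This keeps df1's own row index untouched, so duplicate keys in df1 are not a problem; only the lookup needs to be unique.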