Replacing values in dataframe with values from other

Question

I have two dataframes, and one of them holds values that should replace placeholder values in the other. What is the best way to do the replacement?

For instance, each 'none' in the type column of df1 should be replaced with the corresponding type from df2. This is what I have so far, but I am not happy with this approach:

import pandas as pd

df1=pd.DataFrame({"word":['The','big','cat','house'], "type": ['article','none','noun','none'],"pos":[1,2,3,4]})
df2=pd.DataFrame({"word":['big','house'], "type": ['adjective','noun'],"pos":[2,4]})

df1.set_index('pos',inplace=True, drop=True)
df2.set_index('pos',inplace=True, drop=True)

# iterrows() yields copies of the rows, so write back through df1.loc
for i, row in df1.iterrows():
    if row['type']=='none':
        df1.loc[i,'type']=df2.loc[i,'type']

df1 dataframe should change to:

    word       type  pos
0    The    article    1
1    big  adjective    2
2    cat       noun    3
3  house       noun    4

Thanks :)

Check out my updated answer, I think it might be what you're looking for.
– oppressionslayer, Commented Dec 5, 2019 at 0:30

Henry Yik · Accepted Answer · 2019-12-05 02:02:19Z

1

If the index of df2 always gives the positions in df1 where the type should be replaced, you can simply do:

df1.loc[df2.index,"type"] = df2["type"]

print(df1)

# Output:
      word       type
pos                  
1      The    article
2      big  adjective
3      cat       noun
4    house       noun
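
A caveat worth sketching: selecting with df1.loc[df2.index, ...] raises a KeyError if df2 contains a position that df1 does not have. The snippet below is a minimal sketch of one way to guard against that by restricting the assignment to the shared positions. The extra row at position 9 is hypothetical data added only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"word": ["The", "big", "cat", "house"],
                    "type": ["article", "none", "noun", "none"],
                    "pos": [1, 2, 3, 4]}).set_index("pos")
# Hypothetical df2 with an extra position (9) that does not exist in df1
df2 = pd.DataFrame({"word": ["big", "house", "dog"],
                    "type": ["adjective", "noun", "noun"],
                    "pos": [2, 4, 9]}).set_index("pos")

# Only touch positions present in both frames
common = df1.index.intersection(df2.index)
df1.loc[common, "type"] = df2.loc[common, "type"]
print(df1)
```

With the original df2 from the question, the intersection is simply df2.index, so this reduces to the one-liner above.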


1 Comment

JarochoEngineer Over a year ago

Thanks Henry, I was looking for something like this, the simplest approach.

CypherX · Accepted Answer · 2019-12-05 00:35:36Z

1

Solution

This uses a boolean mask rather than the .apply() method.

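# Rows of df1 that still hold the 'none' placeholder (both frames are indexed by 'pos')
condition = df1['type'] == 'none'
# The assignment aligns on the index, so each selected row gets the type at the same 'pos' in df2
df1.loc[condition, 'type'] = df2['type']
df1.reset_index(inplace=True)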

Output:

   pos   word       type
0    1    The    article
1    2    big  adjective
2    3    cat       noun
3    4  house       noun
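
If some 'none' rows have no counterpart in df2, a mask on the placeholder alone would leave those rows as NaN after the assignment. Here is a minimal sketch, assuming you would rather keep the placeholder in that case. The names is_placeholder and has_match are illustrative, and the one-row df2 is hypothetical data.

```python
import pandas as pd

df1 = pd.DataFrame({"word": ["The", "big", "cat", "house"],
                    "type": ["article", "none", "noun", "none"],
                    "pos": [1, 2, 3, 4]}).set_index("pos")
# Hypothetical df2 that only covers position 2, so position 4 has no replacement
df2 = pd.DataFrame({"word": ["big"],
                    "type": ["adjective"],
                    "pos": [2]}).set_index("pos")

is_placeholder = df1["type"].eq("none")   # rows that need a value
has_match = df1.index.isin(df2.index)     # rows for which df2 has a value

# Replace only where both conditions hold; other rows are left untouched
df1.loc[is_placeholder & has_match, "type"] = df2["type"]
print(df1)
```

Position 4 keeps its 'none' value here because df2 has nothing to offer for it.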


2 Comments

CypherX Over a year ago

@JuanPerez Please try it out and leave a comment if it worked.

JarochoEngineer Over a year ago

Thanks @CypherX, it worked like a charm. Thank you :)

oppressionslayer · Accepted Answer · 2019-12-05 00:41:02Z

How about:

df = df2.set_index('word').combine_first(df1.set_index('word'))
df.pos = df.pos.astype(int)

output:

            type  pos
word                 
The      article    1
big    adjective    2
cat         noun    3
house       noun    4

and, after resetting the index:

In [970]: df.reset_index()
Out[970]:
    word       type  pos
0    The    article    1
1    big  adjective    2
2    cat       noun    3
3  house       noun    4

Or, indexing by 'pos':

df = df2.set_index('pos').combine_first(df1.set_index('pos')).reset_index()
# reset_index() puts 'pos' first, so restore the original column order
colidx = ['word', 'type', 'pos']
df.reindex(columns=colidx)

output:

Out[976]: 
    word       type  pos
0    The    article    1
1    big  adjective    2
2    cat       noun    3
3  house       noun    4

I would prefer to set the index to the position, because a word can appear more than once, and the position can differentiate repeated occurrences of the same word in the dataframe.
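
To illustrate the repeated-word case, here is a minimal sketch of the 'pos'-based combine_first variant on hypothetical data in which the word 'light' occurs twice with different types. Keying on 'word' would be ambiguous for such data, while 'pos' identifies each occurrence uniquely.

```python
import pandas as pd

# Hypothetical data: 'light' appears at positions 2 and 4 with different types
df1 = pd.DataFrame({"word": ["The", "light", "cat", "light"],
                    "type": ["article", "none", "noun", "none"],
                    "pos": [1, 2, 3, 4]})
df2 = pd.DataFrame({"word": ["light", "light"],
                    "type": ["adjective", "noun"],
                    "pos": [2, 4]})

# Values from df2 take priority; positions missing from df2 fall back to df1
result = (df2.set_index("pos")
             .combine_first(df1.set_index("pos"))
             .reset_index()
             .reindex(columns=["word", "type", "pos"]))
print(result)
```

Note that this relies on df2 carrying the full row (word and type) for each position it covers, as in the question.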
