Map values from one DataFrame to another

Question

I have two DataFrames:

df - the core DataFrame with columns/cells that I want to expand
maptable - a maptable DataFrame that maps certain columns

An example:

maptable:

id | period
A  | winter
B  | summer
A  | summer
nan | summer
B  | nan

df:

id | period  | other_col
A  | None    | X
B  | summer  | Y
C  | None    | Z
D  | spring  | D
D  | NaN

How can I only map the cells in df that are None/empty/nan using the maptable and the identifier column id?

df['period'].fillna(df['id'].map(maptable.set_index('id')['period']))?
– ansev, Commented Feb 7, 2020 at 15:37

ansev · Accepted Answer · 2020-02-07 16:46:37Z

Use Series.map and then fill NaN with Series.fillna:

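# Look up each id in maptable and use the result only where period is missing.
# Series.map with a Series needs a unique index, so maptable must have one row per id.
df['period'] = df['period'].fillna(df['id'].map(maptable.set_index('id')['period']))

# Alternative: start from the mapped values and keep df's own value wherever it is not null.
# df['period'] = (df['id'].map(maptable.set_index('id')['period'])
#                         .where(df['period'].isnull(), df['period']))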

Output

  id other_col  period
0  A         X  winter
1  B         Y  summer
2  C         Z     NaN
3  D         D  spring
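
The maptable in the question repeats ids and contains NaN cells, and Series.map raises an error when the lookup Series has a non-unique index. A minimal sketch of one way around that is to reduce maptable to one row per id first. The rule used here (the first non-null period per id wins) is an assumption, since the question does not say which duplicate should be preferred:

```python
import numpy as np
import pandas as pd

# The maptable from the question, including the repeated ids and the NaN cells.
maptable = pd.DataFrame({
    "id": ["A", "B", "A", np.nan, "B"],
    "period": ["winter", "summer", "summer", "summer", np.nan],
})
df = pd.DataFrame({
    "id": ["A", "B", "C", "D"],
    "period": [None, "summer", None, "spring"],
    "other_col": list("XYZD"),
})

# Assumption: when an id occurs more than once, the first non-null period wins.
lookup = (
    maptable.dropna(subset=["id", "period"])
            .drop_duplicates(subset="id", keep="first")
            .set_index("id")["period"]
)

# Fill only the missing cells; existing values such as "spring" are left alone.
df["period"] = df["period"].fillna(df["id"].map(lookup))
print(df)
```

Ids with no match in the lookup (C here) simply stay missing.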

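EDIT: with DataFrame.merge (a left merge yields one row per matching maptable row, so an id that appears twice in maptable, like A here, produces two rows):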

new_df = (df.merge(maptable, on='id', how='left')
            .assign(period=lambda x: x['period_x'].fillna(x['period_y']))
            .loc[:, df.columns])
print(new_df)
  id  period other_col
0  A  winter         X
1  A  summer         X
2  B  summer         Y
3  C     NaN         Z
4  D  spring         D
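
If the column name is not fixed, the merge approach can be written with the name held in a variable and an explicit suffix instead of the default period_x / period_y. This is only a sketch: fill_from_map and the "_map" suffix are hypothetical names chosen for illustration, and the sample maptable has one row per id:

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["A", "B", "C", "D"],
    "period": [None, "summer", None, "spring"],
    "other_col": list("XYZD"),
})
maptable = pd.DataFrame({"id": ["A", "B"], "period": ["winter", "summer"]})


def fill_from_map(df, maptable, key, col):
    """Hypothetical helper: fill missing values of `col` in df from maptable, matched on `key`."""
    # Bring across only the key and the one column; df keeps its name, maptable's copy gets "_map".
    merged = df.merge(maptable[[key, col]], on=key, how="left", suffixes=("", "_map"))
    merged[col] = merged[col].fillna(merged[col + "_map"])
    # Restore the original column order, which also drops the helper column.
    return merged.loc[:, df.columns]


new_df = fill_from_map(df, maptable, key="id", col="period")
print(new_df)
```

Because the same variable names the column that is filled and the column that is selected afterwards, the two cannot drift apart.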

edited Feb 7, 2020 at 16:46

answered Feb 7, 2020 at 15:40

7 Comments

WJA Over a year ago

What is the difference with the alternative?

ansev Over a year ago

then we need merge

WJA Over a year ago

Does this also work with x['period_x']? My column name might be dynamic.

ansev Over a year ago

It is the same; you can choose the suffixes.

WJA Over a year ago

I am receiving the error "['period'] not in index" on my dataset.

tdpr · Accepted Answer · 2020-02-07 15:38:56Z

import pandas as pd

# Creating your dataframes
maptable = pd.DataFrame([{"id": "A", "period": "winter"}, {"id": "B", "period": "summer"}])
df = pd.DataFrame({"id": ["A", "B", "C", "D"], "period": [None, "summer", None, "spring"], "other_col": list("XYZD")})

# Merging both dataframes on the "id" key; the clashing "period" columns
# become period_x (from df) and period_y (from maptable)
df1 = pd.merge(left=df, right=maptable, on="id", how="left")
# Keep df's own period where it exists, otherwise take the mapped one
df1["period"] = [x if not pd.isnull(x) else y for x, y in zip(df1["period_x"], df1["period_y"])]
df1.drop(["period_x", "period_y"], axis=1, inplace=True)
print(df1)

Output:

  id other_col  period
0  A         X  winter
1  B         Y  summer
2  C         Z     NaN
3  D         D  spring
