I'm a beginner in Python. I have two dataframes, each with 5 columns, but only the first two columns of each contain matching data. The dataframes have different numbers of records. I would like to compare column A (subscriptionId) of df1 against column A of df2 and, where they match, output column D (ownerEmail) from df2. Where column A doesn't match, ownerEmail should be null.
df1
subscriptionId | displayName | state | authorization | tenantId
12345 | DEV_SPS | Enabled | RoleBased | 938c49a8
67890 | PROD_LINUX | Enabled | RoleBased | 0a9cb9ee
11900 | TST_WIN | Enabled | RoleBased | e1513511
df2
subscriptionId | SubName | Connected | ownerEmail | organization
12345 | DEV_SPS | Enabled | [email protected] | Marketing
67890 | PROD_LINUX | Enabled | [email protected] | Sales
Desired output
subscriptionId | displayName | state | authorization | tenantId | ownerEmail
12345 | DEV_SPS | Enabled | RoleBased | 938c49a8 | [email protected]
67890 | PROD_LINUX | Enabled | RoleBased | 0a9cb9ee | [email protected]
11900 | TST_WIN | Enabled | RoleBased | e1513511 | null
I have tried something like this but it didn't work.
df1['ownerEmail'] = np.where(df1['subscriptionId'] == df2['subscriptionId'], ['ownerEmail'], "")
print(df1)
Any help would be much appreciated.
Thank you.
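
One possible approach, offered as a sketch rather than a definitive answer: the np.where attempt compares two Series of different lengths element by element, which pandas rejects, and ['ownerEmail'] is just a list holding a string, not the column. A left merge on subscriptionId seems to match the desired output instead. The sample frames below are rebuilt from the tables in the question, and the email addresses are made-up placeholders because the real ones are not visible in the post.

```python
import pandas as pd

# Sample data rebuilt from the question's tables.
df1 = pd.DataFrame({
    "subscriptionId": [12345, 67890, 11900],
    "displayName": ["DEV_SPS", "PROD_LINUX", "TST_WIN"],
    "state": ["Enabled", "Enabled", "Enabled"],
    "authorization": ["RoleBased", "RoleBased", "RoleBased"],
    "tenantId": ["938c49a8", "0a9cb9ee", "e1513511"],
})

df2 = pd.DataFrame({
    "subscriptionId": [12345, 67890],
    "SubName": ["DEV_SPS", "PROD_LINUX"],
    "Connected": ["Enabled", "Enabled"],
    # Placeholder addresses (assumption): the real values are hidden in the post.
    "ownerEmail": ["owner1@example.com", "owner2@example.com"],
    "organization": ["Marketing", "Sales"],
})

# Keep every row of df1 and pull in ownerEmail only where subscriptionId matches.
# Rows with no match in df2 get NaN (pandas' missing-value marker) in ownerEmail.
result = df1.merge(
    df2[["subscriptionId", "ownerEmail"]],
    on="subscriptionId",
    how="left",
)

print(result)
```

This assumes subscriptionId has the same dtype in both frames and is unique in df2; if df2 contained duplicate ids, the merge would produce extra rows in the result.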