
I have a large dataset that I need to merge, but I am unsure how to get my desired output.

Here is an example of what I have done:

import pandas as pd

df1 = pd.DataFrame({'identity': ['A', 'A', 'A'], 'Type': ['D', 'E', 'F'], 'count_df1': [7, 8, 9]})
df2 = pd.DataFrame({'identity': ['A'], 'Type': ['D'], 'Name': ['ABC co'], 'count_df2': [5]})

merged = df1.merge(df2, on=['identity', 'Type'], how='inner')

I need to merge on identity and Type.

output:

identity    Type    Name    count_df2
   A          D     ABC co     5

I have also tried an outer join:

  identity  Type        count_df1   Name_y  count_df2
0   A         D            7         ABC co   5.0
1   A         E            8         NaN      NaN
2   A         F            9         NaN      NaN

What I hope to get:

    identity    Type     Name        count_df1    count_df2
        A         D     ABC co         7           5
        A         E     ABC co         8           0
        A         F     ABC co         9           0  

Please help! Thank you very much.

  • Thanks for the fast response guys ! Commented Nov 13, 2020 at 11:57

3 Answers

1
df1.merge(df2, on=['identity','Type'], how='outer').fillna({"count_df2": 0, "Name": "ABC co"})

Use an outer join, then fill the missing values column by column with fillna.
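If hard-coding "ABC co" is not practical on the full dataset, one possible variation is to look up the name from df2 instead. This is only a sketch: it assumes each identity has exactly one Name in df2, which the question does not state, and name_by_identity is a helper name introduced here for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'identity': ['A', 'A', 'A'], 'Type': ['D', 'E', 'F'], 'count_df1': [7, 8, 9]})
df2 = pd.DataFrame({'identity': ['A'], 'Type': ['D'], 'Name': ['ABC co'], 'count_df2': [5]})

merged = df1.merge(df2, on=['identity', 'Type'], how='outer')

# Assumption: one Name per identity in df2 (hypothetical lookup table).
name_by_identity = df2.drop_duplicates('identity').set_index('identity')['Name']

# Fill missing names from the lookup rather than from a hard-coded string.
merged['Name'] = merged['Name'].fillna(merged['identity'].map(name_by_identity))

# Missing counts become 0; cast back to int because NaN forced the column to float.
merged['count_df2'] = merged['count_df2'].fillna(0).astype(int)

# Reorder the columns to match the layout asked for in the question.
merged = merged[['identity', 'Type', 'Name', 'count_df1', 'count_df2']]
print(merged)
```

The astype(int) step is optional; without it count_df2 shows as 5.0, 0.0, 0.0.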


0

It seems that you want your NaNs to be zeros, which you can accomplish by adding .fillna(0) at the end. Note that this fills every NaN in the result, including the missing Name values.

merged = df1.merge(df2, on=['identity', 'Type'], how='outer').fillna(0)


0

Ah okay, I see. As far as I can tell, your outer join already works; the only problem is the NaN values, right? I would suggest using fillna on the result: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.fillna.html
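To check which rows produce those NaN values, the merge can be run with indicator=True, which adds a _merge column saying where each row came from. A small sketch using the sample frames from the question:

```python
import pandas as pd

df1 = pd.DataFrame({'identity': ['A', 'A', 'A'], 'Type': ['D', 'E', 'F'], 'count_df1': [7, 8, 9]})
df2 = pd.DataFrame({'identity': ['A'], 'Type': ['D'], 'Name': ['ABC co'], 'count_df2': [5]})

# indicator=True adds a "_merge" column: 'both', 'left_only' or 'right_only'.
merged = df1.merge(df2, on=['identity', 'Type'], how='outer', indicator=True)

# Rows marked 'left_only' had no matching (identity, Type) pair in df2,
# so their Name and count_df2 are NaN until they are filled.
unmatched = merged[merged['_merge'] == 'left_only']
print(merged)
print(unmatched[['identity', 'Type']])
```

Only the rows marked left_only need filling; the row marked both already carries the values from df2.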
