Pandas/Python Combine two data frames with duplicate rows

Question

Ok this seems like it should be easy to do with merge or concatenate operations but I can't crack it. I'm working in pandas.

I have two dataframes that share some rows, and I want to combine them so that no rows or columns are duplicated. It would work like this:

df1:

A B 
a 1
b 2
c 3

df2:

A B 
b 2
c 3
d 4

df3 = df1 combined with df2

A B 
a 1
b 2
c 3
d 4

Some methods I've tried are to select the rows that are in one but not the other (an XOR) and then append them, but I can't figure out how to do the selection. The other idea I have is to append them and then delete the duplicate rows, but I don't know how to do the latter.

EdChum · Accepted Answer · 2015-06-18 09:17:31Z

6

You want an outer merge:

In [103]:
df1.merge(df2, how='outer')

Out[103]:
   A  B
0  a  1
1  b  2
2  c  3
3  d  4

The above works because merge, by default, joins on every column the two dfs have in common (here A and B), and how='outer' keeps the union of the rows from both sides, so a row that appears in both dfs shows up only once in the result.
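For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's frames and applies the same outer merge. The indicator=True call is an addition that is not part of the original answer; it is only there to show which frame each row came from.

```python
import pandas as pd

# The two frames from the question.
df1 = pd.DataFrame({"A": ["a", "b", "c"], "B": [1, 2, 3]})
df2 = pd.DataFrame({"A": ["b", "c", "d"], "B": [2, 3, 4]})

# With no `on=` argument, merge joins on every shared column (A and B).
df3 = df1.merge(df2, how="outer")

# Optional: indicator=True adds a `_merge` column with the values
# 'left_only', 'right_only' or 'both' for each row.
tagged = df1.merge(df2, how="outer", indicator=True)

print(df3)
print(tagged)
```

One caveat: the merge only collapses rows that match across the two frames. A row that is repeated inside df1 or df2 itself is not removed by this call.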

edited Jun 18, 2015 at 9:17

answered Jun 18, 2015 at 9:10

EdChum

2 Comments

Adam Over a year ago

What if some rows are duplicates and some aren't, and based on the index you want to keep the instances in df1 and drop the repeated indexes in df2? (Should this be a new question?)

PirateApp Over a year ago

What if you want to merge such that values from df2 overwrite the same values in df1?
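One possible way to handle both follow-up cases, sketched under the assumption that rows are matched by index label rather than by column values. The frames left and right and their values are hypothetical, made up for illustration. This approach uses concat plus Index.duplicated instead of the merge from the answer.

```python
import pandas as pd

# Hypothetical frames that share the index label "b" but disagree on its value.
left = pd.DataFrame({"B": [1, 2]}, index=["a", "b"])
right = pd.DataFrame({"B": [20, 3]}, index=["b", "c"])

# Stack the rows; the label "b" now appears twice in the index.
stacked = pd.concat([left, right])

# keep="first": for a repeated label, the row from `left` is kept.
prefer_left = stacked[~stacked.index.duplicated(keep="first")]

# keep="last": for a repeated label, the row from `right` replaces it.
prefer_right = stacked[~stacked.index.duplicated(keep="last")]

print(prefer_left)
print(prefer_right)
```

The only difference between the two cases is the keep argument, which decides which copy of a repeated index label survives.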

Georg Plaz · Accepted Answer · 2017-11-09 16:28:52Z

2

You can use the following to drop the duplicates:

pd.concat([df1, df2]).drop_duplicates()
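A runnable sketch of this approach on the question's frames. The trailing reset_index(drop=True) is an addition to the one-liner above; without it the result keeps the original row labels (0, 1, 2, 2 for the sample data).

```python
import pandas as pd

# The two frames from the question.
df1 = pd.DataFrame({"A": ["a", "b", "c"], "B": [1, 2, 3]})
df2 = pd.DataFrame({"A": ["b", "c", "d"], "B": [2, 3, 4]})

# Stack the rows, drop exact duplicate rows (the first occurrence is kept),
# then renumber the index so it runs 0..n-1.
df3 = pd.concat([df1, df2]).drop_duplicates().reset_index(drop=True)

print(df3)
```

Unlike the outer merge, drop_duplicates also removes rows that are repeated within a single frame, and it does not require the two frames to be joined on key columns.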

edited Nov 9, 2017 at 16:28

Georg Plaz

answered Nov 9, 2017 at 16:07

forsaken
