pandas drop duplicates doesn't return dataframe with duplicates removed

Question

I have a dataframe:

df = pd.DataFrame({'src':['A','B','C'],'trg':['A','C','B'],'wgt':[1,3,7]})

I want to drop the duplicates from this dataframe for columns src and trg:

df = df.drop_duplicates(subset=['src','trg'],keep='first',inplace=False)

This should drop the first row, where src='A' and trg='A'.

But this is not happening; the dataframe is unchanged. What am I doing wrong?

That worked and removed all the duplicates without keeping at least one pair. But can you suggest why drop_duplicates is not working?
– Ashutosh, Commented Mar 1, 2020 at 19:40
That's because no two rows share the same values of src and trg. When you use subset, drop_duplicates compares rows to each other on the whole subset; it does not compare src to trg within a row.
– Mohit Motwani, Commented Mar 1, 2020 at 19:41
drop_duplicates compares rows on the given columns. That is, if you have another row with B and C as src and trg, that row will be dropped.
– Quang Hoang, Commented Mar 1, 2020 at 19:42
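
To illustrate the comments above, here is a minimal sketch of how drop_duplicates treats the subset. The extra (B, C) row with weight 9 is a made-up addition for illustration and is not part of the original question.

```python
import pandas as pd

# The frame from the question: no two rows share the same (src, trg) pair.
df = pd.DataFrame({'src': ['A', 'B', 'C'], 'trg': ['A', 'C', 'B'], 'wgt': [1, 3, 7]})
unchanged = df.drop_duplicates(subset=['src', 'trg'], keep='first')

# Hypothetical extra row that repeats the (B, C) pair with a different weight.
extra = pd.DataFrame({'src': ['B'], 'trg': ['C'], 'wgt': [9]})
df_dup = pd.concat([df, extra], ignore_index=True)
deduped = df_dup.drop_duplicates(subset=['src', 'trg'], keep='first')

print(len(unchanged))  # 3: nothing to drop, every (src, trg) pair is unique
print(len(deduped))    # 3: four rows went in, the repeated (B, C) row was dropped
```

Row A/A survives in both cases because drop_duplicates only looks for a second row with the same pair; it never checks whether src equals trg inside one row.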

Junior Yao · Accepted Answer · 2020-03-01 20:30:01Z

To remove the duplicate, you can refer to the following example, which I worked out in a Python notebook.

Or use df = df[df['src'] != df['trg']], which keeps only the rows where src and trg differ.
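
As a rough sketch of the filter above applied to the dataframe from the question (the names mask and filtered, and the reset_index call, are my own additions for illustration):

```python
import pandas as pd

df = pd.DataFrame({'src': ['A', 'B', 'C'], 'trg': ['A', 'C', 'B'], 'wgt': [1, 3, 7]})

# Element-wise comparison of the two columns: True where src differs from trg.
mask = df['src'] != df['trg']

# Keep only rows whose src and trg differ; reset_index is optional tidying.
filtered = df[mask].reset_index(drop=True)
print(filtered)
```

This drops the A/A row and leaves the B/C and C/B rows. Note that it is a row filter, not a de-duplication: it removes every row where src equals trg, whether or not the row is repeated elsewhere.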
