python dataframe .duplicated returns multiple occurrences for same value

Question

Given a following dataframe:

import pandas as pd

df = pd.DataFrame({'month': [2, 2, 1, 1, 2, 10],
                   'year': [2017, 2017, 2020, 2020, 2018, 2019],
                   'sale': [60, 45, 90, 20, 28, 36],
                   'title': ['Ones', 'Twoes', 'Three', 'Four', 'Five', 'Six']})

I am trying to get duplicates in month columnn.

df[df.duplicated(subset=['month'])]

By default, keep="first"

But this is giving two occurrences for month 2.

   month  year  sale  title
1      2  2017    45  Twoes
3      1  2020    20   Four
4      2  2018    28   Five

I'm confused with the output. Am I missing something here?

the output is the duplicate values in your dataframe, not the values after dropping the duplicates. — Amit Gupta
– Amit Gupta, Commented Jul 6, 2021 at 7:13

jezrael · Accepted Answer · 2021-07-06 07:12:21Z

2

Ouput is filter all duplicates with remove first dupe.

If need first dupes only invert mask and chain mask for filter only dupes with keep=False parameter:

df1 = df[~df.duplicated(subset=['month']) & df.duplicated(subset=['month'], keep=False)]
print (df1)
   month  year  sale  title
0      2  2017    60   Ones
2      1  2020    90  Three

answered Jul 6, 2021 at 7:12

jezrael

867k102 gold badges1.4k silver badges1.3k bronze badges

Sign up to request clarification or add additional context in comments.

Comments

Amit Gupta · Accepted Answer · 2021-07-06 07:17:54Z

2

the output is the duplicate values in your dataframe, not the values after dropping the duplicates. if you want only the non duplicate values then

df.drop_duplicates(subset=['month'])

which will give you

  month  year   sale title
0   2   2017    60  Ones
2   1   2020    90  Three
5   10  2019    36  Six

you can use keep = ['first', 'last', 'None'] based on your requirement.

answered Jul 6, 2021 at 7:17

Amit Gupta

2,9784 gold badges31 silver badges41 bronze badges

Collectives™ on Stack Overflow

python dataframe .duplicated returns multiple occurrences for same value

2 Answers 2

Comments

Comments

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Comments

Comments

Related