Remove both rows that duplicate in R

Question

I'm trying to remove every row whose x value appears more than once. In the example below, I want to remove both rows where x is 2 and all three rows where x is 6. I have tried xy[!duplicated(xy$x), ], but this still keeps the first row of each duplicated group, and I do not want any of those rows.

    x <- c(1, 2, 2, 4, 5, 6, 6, 6)
    y <- c(1888, 1999, 2000, 2001, 2004, 2005, 2010, 2011)
    xy <- as.data.frame(cbind(x, y))
    xy
    #   x    y
    # 1 1 1888
    # 2 2 1999
    # 3 2 2000
    # 4 4 2001
    # 5 5 2004
    # 6 6 2005
    # 7 6 2010
    # 8 6 2011

What I want is only the rows whose x value occurs exactly once:

    #   x    y
    # 1 1 1888
    # 4 4 2001
    # 5 5 2004

Any help is appreciated. I need to avoid hard-coding the values to remove, since I am dealing with a data frame with thousands of records.

    xy[!(duplicated(xy$x) | duplicated(xy$x, fromLast = TRUE)), ]
– user20650, Commented Mar 6, 2016 at 21:52
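To see why scanning in both directions works, here is a minimal sketch that splits the one-liner from the comment into steps. The intermediate names (dup_forward, dup_backward, in_dup_group, result) are only illustrative and not part of the original comment.

```r
# Same sample data as in the question
xy <- data.frame(
  x = c(1, 2, 2, 4, 5, 6, 6, 6),
  y = c(1888, 1999, 2000, 2001, 2004, 2005, 2010, 2011)
)

# TRUE for every occurrence of a value after its first one
dup_forward <- duplicated(xy$x)

# TRUE for every occurrence of a value before its last one
dup_backward <- duplicated(xy$x, fromLast = TRUE)

# A row whose value occurs more than once is flagged in at least one direction
in_dup_group <- dup_forward | dup_backward

# Keep only the rows that were never flagged
result <- xy[!in_dup_group, ]
result
```

With the sample data this should keep rows 1, 4 and 5, since only the x values 1, 4 and 5 occur once.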

A. Webb · Accepted Answer · 2016-03-06 21:43:55Z

3

You can count how often each value occurs and keep only the singletons:

    xy[ave(xy$x, xy$x, FUN = length) == 1, ]
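As a rough illustration of what ave() contributes here, the sketch below computes the per-row group size first and filters second. The names group_size and singletons are introduced only for this example.

```r
# Same sample data as in the question
xy <- data.frame(
  x = c(1, 2, 2, 4, 5, 6, 6, 6),
  y = c(1888, 1999, 2000, 2001, 2004, 2005, 2010, 2011)
)

# ave() applies FUN within each group of equal x values and returns a
# vector as long as the input, so each row carries the size of its group
group_size <- ave(xy$x, xy$x, FUN = length)
group_size

# Rows whose group has exactly one member are the non-duplicated ones
singletons <- xy[group_size == 1, ]
singletons
```

Because group_size lines up row by row with xy, it can be used directly as a logical filter, without a separate lookup of which values are duplicated.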


SymbolixAU · Accepted Answer · 2016-03-06 21:42:29Z

2

We can do

    xy[!xy$x %in% unique(xy[duplicated(xy$x), "x"]), ]
    #   x    y
    # 1 1 1888
    # 4 4 2001
    # 5 5 2004

because

    unique(xy[duplicated(xy$x), "x"])

gives the values of x that occur more than once. We can then filter out every row that has one of those values.


DatamineR · Accepted Answer · 2016-03-06 21:49:40Z

2

Or like this:

    xy[xy$x %in% names(which(table(xy$x) == 1)), ]
    #   x    y
    # 1 1 1888
    # 4 4 2001
    # 5 5 2004
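The same idea can be unpacked step by step. This is only a sketch, and counts, keep_values and result are illustrative names. Note that names() on a table returns character strings, so the %in% comparison relies on R coercing the numeric x values to character.

```r
# Same sample data as in the question
xy <- data.frame(
  x = c(1, 2, 2, 4, 5, 6, 6, 6),
  y = c(1888, 1999, 2000, 2001, 2004, 2005, 2010, 2011)
)

# Frequency of each distinct x value
counts <- table(xy$x)

# Values that occur exactly once, as a character vector
keep_values <- names(counts)[counts == 1]

# %in% coerces the numeric x values to character before matching
result <- xy[xy$x %in% keep_values, ]
result
```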
