I'm trying to subset a dataframe on the basis of conditions from multiple columns. Here is my dataframe.

var1 <- c("x","x","x","y","y","z","z","z","z")
var2 <- c("a","b","c","a","b","a","b","c","d")
var3 <- c(2,4,1,4,1,6,2,5,8)
data1 <- data.frame(var1,var2,var3)
# -------------------------------------------------------------------------
#     var1 var2 var3
# 1    x    a    2
# 2    x    b    4
# 3    x    c    1
# 4    y    a    4
# 5    y    b    1
# 6    z    a    6
# 7    z    b    2
# 8    z    c    5
# 9    z    d    8

Output

The output I expect is:

#     var1
# 1    y
# 2    z

Condition

The following are the conditions leading to the output:

  1. The output is a dataframe where only values of var1 are selected.
  2. Within each value of var1, the value of var3 where var2 equals "a" is greater than the value of var3 where var2 equals "b".

I'm unable to write code for this condition, since it depends on multiple columns.

Thank you.

2 Comments
  • Please add the expected output, and your attempts to solve this problem. Commented Sep 21, 2019 at 9:11
  • stackoverflow.com/questions/5963269/… Commented Sep 21, 2019 at 9:11

3 Answers

This gives you the matching var1 values as a factor:

a_rows <- subset(data1, var2 == "a")  # rows where var2 is "a"
b_rows <- subset(data1, var2 == "b")  # rows where var2 is "b"
a_rows[a_rows$var3 > b_rows$var3, "var1"]

# [1] y z
# Levels: x y z

You can use data.frame to get what you want as follows:

data.frame(var1 = subset(data1, (var2=="a"))[subset(data1, (var2=="a"))$var3 > subset(data1, (var2=="b"))$var3, "var1"])
#   var1
# 1    y
# 2    z
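
Note that the comparison above lines up the "a" rows and the "b" rows by position. That works for this data because every var1 group has exactly one "a" row and one "b" row, in the same order. As a minimal sketch for data where that might not hold, the two subsets can be matched on var1 with base R's merge() before comparing. The names a_rows, b_rows, paired and result are only illustrative.

```r
# Rebuild the example data from the question
data1 <- data.frame(
  var1 = c("x", "x", "x", "y", "y", "z", "z", "z", "z"),
  var2 = c("a", "b", "c", "a", "b", "a", "b", "c", "d"),
  var3 = c(2, 4, 1, 4, 1, 6, 2, 5, 8)
)

# Keep only the columns needed from the "a" rows and the "b" rows
a_rows <- data1[data1$var2 == "a", c("var1", "var3")]
b_rows <- data1[data1$var2 == "b", c("var1", "var3")]

# Join on var1 so each group's "a" value sits next to its "b" value;
# the two var3 columns become var3_a and var3_b
paired <- merge(a_rows, b_rows, by = "var1", suffixes = c("_a", "_b"))

# Keep the groups where the "a" value exceeds the "b" value
result <- data.frame(var1 = paired$var1[paired$var3_a > paired$var3_b])
result
```

Groups that lack either an "a" row or a "b" row drop out of the merge, so they never reach the comparison.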

The most intuitive solution might be to use a for-loop. Probably, there are shorter and more elegant ways to solve this problem, but this should work:

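library(dplyr)

selection <- c()

for (i in unique(data1$var1)) {
  # Rows of the current var1 group where var2 is "a" or "b"
  var_store <- data1 %>%
    filter(var1 == i, var2 == "a" | var2 == "b")

  # var3 values for "a" and "b" within this group
  a_val <- var_store %>% filter(var2 == "a") %>% pull(var3)
  b_val <- var_store %>% filter(var2 == "b") %>% pull(var3)

  # Keep the group when its "a" value exceeds its "b" value
  if (a_val > b_val) {
    selection <- c(selection, i)
  }
}

data1 %>%
  filter(var1 %in% selection)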


# # A tibble: 6 x 3
#   var1  var2   var3
#   <chr> <chr> <dbl>
# 1 y     a         4
# 2 y     b         1
# 3 z     a         6
# 4 z     b         2
# 5 z     c         5
# 6 z     d         8
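
One of the shorter ways hinted at above might be a grouped filter. This is only a sketch, and it assumes every var1 group has exactly one "a" row and one "b" row, as in the example data. It returns just the var1 column, which is what the question asks for.

```r
library(dplyr)

# Rebuild the example data from the question
data1 <- data.frame(
  var1 = c("x", "x", "x", "y", "y", "z", "z", "z", "z"),
  var2 = c("a", "b", "c", "a", "b", "a", "b", "c", "d"),
  var3 = c(2, 4, 1, 4, 1, 6, 2, 5, 8)
)

result <- data1 %>%
  group_by(var1) %>%
  # One TRUE/FALSE per group: is the "a" value larger than the "b" value?
  filter(var3[var2 == "a"] > var3[var2 == "b"]) %>%
  ungroup() %>%
  # Collapse the kept rows to one row per var1 value
  distinct(var1)

result
```

The single TRUE or FALSE computed per group is recycled across that group's rows, so a group is kept or dropped as a whole.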

2 Comments

I was able to get the desired answer by transposing the dataframe using dcast().
@sayandesarkar, in that case, you can answer your own question and accept it as an answer.
I found that reshaping the dataframe solves my problem. I transposed var2 into columns using dcast() to get the desired result.
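
The answer does not show its code, so the following is only a sketch of how the reshape might look. dcast() exists in both reshape2 and data.table. This sketch assumes reshape2, and the names wide, keep and result are illustrative.

```r
library(reshape2)  # assumption: dcast() taken from reshape2

# Rebuild the example data from the question
data1 <- data.frame(
  var1 = c("x", "x", "x", "y", "y", "z", "z", "z", "z"),
  var2 = c("a", "b", "c", "a", "b", "a", "b", "c", "d"),
  var3 = c(2, 4, 1, 4, 1, 6, 2, 5, 8)
)

# One row per var1, one column per var2 value, filled with var3
# (combinations that do not occur become NA)
wide <- dcast(data1, var1 ~ var2, value.var = "var3")

# which() skips NA comparisons, so groups missing "a" or "b" are dropped
keep <- which(wide$a > wide$b)

# Keep only the var1 column and renumber the rows from 1
result <- wide[keep, "var1", drop = FALSE]
rownames(result) <- NULL
result
```

With the values side by side in columns a and b, the condition becomes a plain row-wise comparison.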
