I want to summarise a data frame by group using multiple functions. Reproducible data:

library(dplyr)
df1 <- data.frame(a=c('a', 'a', 'b', 'b', 'c', 'c'), b=c(1,NA,3,2,2,1), c=c(1,3,5,5,2,4))

One of these is a custom function that returns the value of df1$b at the row where df1$c reaches its maximum within each group (df1$a). When that result is NA, it should instead return the value of df1$b at the second-highest value of df1$c. The following works:

namax <- function(x,y) ifelse(is.na(y[x==max(x)] & length(x)>1),
                              y[x==sort(x,partial=length(x)-1)[length(x)-1]], y[x==max(x)])

I then try to summarise df1 using:

df2 <- df1 %>%
  dplyr::group_by(a) %>%
  summarise(meanc = mean(c),
            maxc = namax(c,b))

This returns the following error, because for df1$a == 'b' the maximum value of df1$c occurs twice, with different values of df1$b:

Error: Column 'maxc' must be length 1 (a summary value), not 2
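
To make the cause concrete, here is a quick base R check of how many values namax returns for each group. It is only a sketch: df1 and namax are copied from above so the snippet runs on its own, and per_group is a name introduced purely for illustration.

```r
# Data and function repeated from the question so this runs standalone
df1 <- data.frame(a = c('a', 'a', 'b', 'b', 'c', 'c'),
                  b = c(1, NA, 3, 2, 2, 1),
                  c = c(1, 3, 5, 5, 2, 4))

namax <- function(x, y) ifelse(is.na(y[x == max(x)] & length(x) > 1),
                               y[x == sort(x, partial = length(x) - 1)[length(x) - 1]],
                               y[x == max(x)])

# Call namax on each group separately (per_group is a hypothetical helper name)
per_group <- lapply(split(df1, df1$a), function(g) namax(g$c, g$b))

# Number of values returned per group: group 'b' yields two, the others one
lengths(per_group)
```

Group 'b' is the only one where namax returns a vector of length 2, which is what summarise() rejects.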

Is there an elegant solution through which dplyr returns both values, while simultaneously executing the other call to summarise() (e.g. by adding do() to the call to group_by)? In my applied case I am trying to run several different calls to summarise, aside from the one using the namax function.

1 Answer

You can wrap the result in a list, so that each group contributes exactly one (list) element regardless of how many values namax returns:

library(dplyr)

df1 %>%
  group_by(a) %>%
  summarise(meanc = mean(c),
            maxc = list(namax(c, b)))

# A tibble: 3 x 3
#  a     meanc maxc     
#  <fct> <dbl> <list>   
#1 a         2 <dbl [1]>
#2 b         5 <dbl [2]>
#3 c         3 <dbl [1]>

You can then use unnest() from tidyr to expand the list column into one row per value:

library(tidyr)

df1 %>%
  group_by(a) %>%
  summarise(meanc = mean(c),
            maxc = list(namax(c, b))) %>%
  unnest(maxc)

# A tibble: 4 x 3
#  a     meanc  maxc
#  <fct> <dbl> <dbl>
#1 a         2     1
#2 b         5     3
#3 b         5     2
#4 c         3     1
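
As a possible alternative, newer dplyr releases (1.1.0 and later; this is an assumption about your installed version) provide reframe(), which lets a summary expression return more than one row per group, so the list() and unnest() steps may not be needed. A minimal sketch, with df1 and namax copied from the question and res used as an illustrative name:

```r
library(dplyr)  # assumes dplyr >= 1.1.0, where reframe() was introduced

# Data and function repeated from the question so this runs standalone
df1 <- data.frame(a = c('a', 'a', 'b', 'b', 'c', 'c'),
                  b = c(1, NA, 3, 2, 2, 1),
                  c = c(1, 3, 5, 5, 2, 4))

namax <- function(x, y) ifelse(is.na(y[x == max(x)] & length(x) > 1),
                               y[x == sort(x, partial = length(x) - 1)[length(x) - 1]],
                               y[x == max(x)])

# reframe() accepts results of any length per group; length-1 results
# such as mean(c) are recycled to match the longer result in that group
res <- df1 %>%
  group_by(a) %>%
  reframe(meanc = mean(c),
          maxc = namax(c, b))

res
```

Unlike summarise(), reframe() always returns an ungrouped data frame, so add another group_by() afterwards if later steps depend on the grouping.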