I want to write a function for a very repetitive action. The data look like this:
id<-c(100,104,999,225,350,450)
sex<-c('female','male','male','female','male','male')
race<-c('black','white','white','white','black','white')
class<-c('a','a','c','b','c','b')
adur<-c(3,3,15,3,3,59)
bdur<-c(2,59,26,59,2,14)
cdur<-c(1,59,59,59,59,1)
ae<-c(1,1,1,1,1,0)
be<-c(1,0,1,0,1,1)
ce<-c(1,0,0,0,1,1)
mydata<-data.frame(id,sex,race,class,adur,bdur,cdur,ae,be,ce)
   id    sex  race class adur bdur cdur ae be ce
1 100 female black     a    3    2    1  1  1  1
2 104   male white     a    3   59   59  1  0  0
3 999   male white     c   15   26   59  1  1  0
4 225 female white     b    3   59   59  1  0  0
5 350   male black     c    3    2   59  1  1  1
6 450   male white     b   59   14    1  0  1  1
I want to group by different variables (sex, race, class) and, for each group, compute the total duration, the total number of events, and the failure rate. This is my attempt:
stp_f<-function(ivar,idur,ie){
x<-mydata %>% group_by(ivar) %>% summarise(sumdur=sum(idur),
sumev=sum(ie),
failrate=sumev/sumdur) %>%
rename(var=ivar)
}
stp_f(sex,adur,ae)
stp_f(sex,bdur,be)
stp_f(sex,cdur,ce)
It doesn't work, I think because R does not treat function arguments as column names this way. It has been suggested that I abandon the tidyverse and use data.table instead, but because I am not familiar with data.table syntax, I find it hard to wrap my head around. Can someone explain in detail how to write this function in data.table, or how to do it with dplyr grammar?
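To make the goal concrete, here is a minimal sketch of the kind of function I am after. It assumes a reasonably recent dplyr in which the embrace operator `{{ }}` is available (rlang 0.4.0 or later, as far as I understand), and I am not sure it is the recommended approach. The name `stp_f2` and the extra `data` argument are my own placeholders, and the data frame is a trimmed copy of `mydata` so the snippet runs on its own.

```r
library(dplyr)

# Trimmed copy of the example data (only the columns used below)
mydata <- data.frame(
  sex  = c('female', 'male', 'male', 'female', 'male', 'male'),
  adur = c(3, 3, 15, 3, 3, 59),
  ae   = c(1, 1, 1, 1, 1, 0)
)

# stp_f2 is a hypothetical name. {{ }} forwards the bare column names
# given as arguments into the dplyr verbs.
stp_f2 <- function(data, ivar, idur, ie) {
  data %>%
    group_by(var = {{ ivar }}) %>%          # grouping column is named "var"
    summarise(sumdur   = sum({{ idur }}),   # total duration per group
              sumev    = sum({{ ie }}),     # total events per group
              failrate = sumev / sumdur)    # events per unit of duration
}

res <- stp_f2(mydata, sex, adur, ae)
print(res)
```

With the full data, the same call would be repeated as `stp_f2(mydata, sex, bdur, be)`, `stp_f2(mydata, race, adur, ae)`, and so on, which is the repetition I am trying to wrap up.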