
I have a data frame that looks like this:

structure(list(A = c(70, 70, 70, 70, 70, 70), T = c(0.1, 0.2, 
0.3, 0.4, 0.5, 0.6), X = c(434.01, 434.01, 434.75, 434.75, 434.75, 
434.01), Y = c(454.92, 454.92, 454.92, 454.92, 454.18, 454.92
), V = c(0, 0, 21.128, 0, 14.94, 14.94), thetarad = c(0.151841552716899, 
0.151841552716899, 0.150990672182432, 0.150990672182432, 0.150177486839524, 
0.151841552716899), thetadeg = c(8.69988012340509, 8.69988012340509, 
8.6511282599214, 8.6511282599214, 8.6045361718215, 8.69988012340509
)), .Names = c("A", "T", "X", "Y", "V", "thetarad", "thetadeg"
), row.names = 1423:1428, class = "data.frame")

I want to subset specific time points at 30-second intervals in R. I can do this by manually subsetting each time point that I want:

a1=subset(binA, T==0.1)
a2=subset(binA, T==30)
a3=subset(binA, T==60)
a4=subset(binA, T==90)
a5=subset(binA, T==120)
a6=subset(binA, T==150)
a7=subset(binA, T==180)
a8=subset(binA, T==210)
a9=subset(binA, T==240)
a10=subset(binA, T==270)
a11=subset(binA, T==300)
a12=subset(binA, T==330)
a13=subset(binA, T==360)
a14=subset(binA, T==390)
a15=subset(binA, T==420)
a16=subset(binA, T==450)
a17=subset(binA, T==480)
a18=subset(binA, T==510)
a19=subset(binA, T==540)
a20=subset(binA, T==570)
a21=subset(binA, T==599.5)

I tried subsetting using sapply and the seq function but got confusing results. I also want to count the unique values of A in each subset of the data. I know I can do this using the count function in the plyr package:

a1=count(unique(subset(binA, T==0.1)))

but count will work with one data frame and not multiple ones (correct me if I am wrong). I also want to take the means of thetadeg for each subset (this should be easy for sapply in one data frame only). So I need help on how to write a function with specific seq points.

I know this problem is trivial but help would be appreciated.

Thanks

1
  • Perhaps cut or findInterval would be of use here.... Commented Nov 14, 2013 at 16:42

4 Answers

1

Assuming the data is in a data frame named df, try this:

sapply(c(0.1,seq(30,599,30),599.5),
       function(x)
         length(unique(df[ df$T==x, "A"])))
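One caveat with the exact `==` comparison above: if T was built by repeated addition, values such as 30 may not be stored exactly. A minimal sketch of the same idea with a tolerance, which also returns the mean of thetadeg, is shown here. The data frame, the tolerance `tol`, and the name `summary_at` are made up for illustration and are not from the question.

```r
# Hypothetical data: two subjects (A = 70, 71) sampled every 0.1 s for 90 s.
# cumsum() is used on purpose so that T carries small floating-point drift.
df <- data.frame(
  A = rep(c(70, 71), each = 900),
  T = rep(cumsum(rep(0.1, 900)), times = 2),
  thetadeg = rep(c(8.5, 9.5), each = 900)
)

times <- c(0.1, seq(30, 90, by = 30))  # time points of interest
tol <- 1e-6                            # assumed tolerance for matching T

# One row per time point: unique count of A and mean of thetadeg
summary_at <- t(sapply(times, function(x) {
  rows <- df[abs(df$T - x) < tol, ]
  c(T = x,
    n_unique_A = length(unique(rows$A)),
    mean_thetadeg = mean(rows$thetadeg))
}))

summary_at
```

Note that this takes the mean over all matching rows; wrap the column in `unique()` if the mean of distinct values is what you need.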

1 Comment

Yes! I just did not know how to input the seq. Now I can also get the mean by changing the last part to mean(unique(binA[ binA$T==x, "thetadeg"]))). Thanks!
0

You should be able to use the following code to get what you want. This doesn't look for 0.1 and 599.5 but that should be easy to manipulate.

timeintervals <- seq(0,600, 30)
for(i in seq_along(timeintervals))
{
  # create the subsets for each time interval
  assign(
    paste0("a",i),
    df[df$T == timeintervals[i],]
    )

  # get all unique As
  assign(
    paste0("b",i),
    unique(df[df$T == timeintervals[i],"A"])
  )

}
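As an alternative to creating many numbered variables with `assign`, the subsets can be kept in one named list, which is usually easier to loop over afterwards. This is only a sketch on made-up data; the names `subsets` and `unique_A` are hypothetical.

```r
# Hypothetical data: two subjects sampled every 0.1 s for 90 s
df <- data.frame(
  A = rep(c(70, 71), each = 900),
  T = rep((1:900) / 10, times = 2),
  thetadeg = rep(c(8.5, 9.5), each = 900)
)

timeintervals <- seq(30, 90, by = 30)

# One data frame per time point, stored in a list named a1, a2, ...
subsets <- lapply(timeintervals, function(t) df[df$T == t, ])
names(subsets) <- paste0("a", seq_along(subsets))

# Unique values of A within each subset
unique_A <- lapply(subsets, function(d) unique(d$A))
```

Individual pieces are then available as `subsets$a1` or `subsets[["a2"]]`, and `sapply(subsets, nrow)` gives the row count of each.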


0

If the purpose is just to get the average, unique count, etc., you don't need to subset. One more thing: is T a factor, or is it continuous so that you need to make bins? Here I am assuming it is a factor.

Here is one approach with plyr:

library(plyr)
ddply(df, ~T, summarise, l = length(unique(A)))
ddply(df, ~T, summarise, m = mean(thetadeg))

1 Comment

I need my data at intervals, not at all time points. Thanks!
0

The function I think you want is split:

subsetted.by.T <- split(dfrm, dfrm$T)
lapply(subsetted.by.T, nrow)

$`0.1`
[1] 1

$`0.2`
[1] 1

$`0.3`
[1] 1

$`0.4`
[1] 1

$`0.5`
[1] 1

$`0.6`
[1] 1

> subsetted.by.T[[1]]
      A   T      X      Y V  thetarad thetadeg
1423 70 0.1 434.01 454.92 0 0.1518416  8.69988

If you want to name these individual items, then the names<- function would be appropriate:

names(subsetted.by.T) <- paste0("a", seq(length(subsetted.by.T) ) )

If the "T" column were somewhat irregular in its values, then perhaps using cut to create categories at regular breaks would be useful for the purpose of splitting. The question might be clarified if "T" were actually a time value. At the moment it's a "numeric" value, but there are cut methods for datetime classes.
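A rough sketch of the `cut` plus `split` idea for 30-second bins might look like this. The data and the names `breaks`, `bin`, and `by_bin` are made up for illustration, and the choice of right-closed intervals is an assumption about what "intervals" should mean here.

```r
# Hypothetical data: two subjects sampled every 0.1 s for 90 s
dfrm <- data.frame(
  A = rep(c(70, 71), each = 900),
  T = rep((1:900) / 10, times = 2),
  thetadeg = rep(c(8.5, 9.5), each = 900)
)

# Bin T into 30-second intervals: (0,30], (30,60], (60,90]
breaks <- seq(0, 90, by = 30)
dfrm$bin <- cut(dfrm$T, breaks = breaks)

# Split on the bin instead of on every distinct T value
by_bin <- split(dfrm, dfrm$bin)

# Per-interval summaries
n_unique_A <- sapply(by_bin, function(d) length(unique(d$A)))
mean_theta <- sapply(by_bin, function(d) mean(d$thetadeg))
```

By default `cut` excludes the lowest break, so a T of exactly 0 would become NA; passing `include.lowest = TRUE` keeps it in the first bin.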

2 Comments

I know split and I thought of using it, but I need my data at intervals, not at all time points. Thanks!
That means you should use 'cut'.
