For the last 3 hours I have been trying to vectorize a piece of code. The idea is to loop over a matrix and compare each value with the mean of its column. If a value is larger than that mean, it is set to 999.

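comparevalues <- function(y){
  # column means, ignoring NA
  x <- apply(y, 2, function(col) mean(col, na.rm = TRUE))
  for (j in seq_len(ncol(y))){
    for (i in seq_len(nrow(y))){
      # && short-circuits, so NA cells are never compared
      if (!is.na(y[i,j]) && y[i,j] > x[j]) y[i,j] <- 999
    }
  }
  return(y)
}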

Testing, e.g. with:

m1 <- matrix(c(1:3,NA,2.4,2.8,3.9,0,1,3,0,2,1.3,2,NA,7,3.9,2.4),6,3)
comparevalues(m1)

results in:

      [,1] [,2]  [,3]
[1,]    1  999   1.3
[2,]    2    0   2.0
[3,]  999    1    NA
[4,]   NA  999 999.0
[5,]  999    0 999.0
[6,]  999  999   2.4

My questions are:

1) Can this kind of structure be vectorized, and if so, how?

2) I am trying to use apply and similar functions in this context. There are likely different ways to solve this, but for learning purposes I'd appreciate it if someone could address apply as well. If there are better ways, I'd like to know them too.

4 Comments

  • What language is that? Commented Oct 28, 2015 at 13:26
  • Probably slower but more memory efficient solution would be indx <- apply(m1, 2, function(x) x > mean(x, na.rm = TRUE)) ; m1[indx] <- 999 Commented Oct 28, 2015 at 13:51
  • @DavidArenburg works perfectly fine, ty. Commented Oct 28, 2015 at 14:08
  • @akrun You are welcome to close even 50. SO will thank you and so do I. Commented Oct 28, 2015 at 14:11

1 Answer

We use colMeans to get the column means of 'm1', replicate them to match every cell by indexing with col(m1), compare 'm1' against those values to get a logical matrix, and use that matrix to select the elements of 'm1' that are assigned 999.

m1[m1 > colMeans(m1, na.rm = TRUE)[col(m1)]] <- 999
m1
#     [,1] [,2]  [,3]
#[1,]    1  999   1.3
#[2,]    2    0   2.0
#[3,]  999    1    NA
#[4,]   NA  999 999.0
#[5,]  999    0 999.0
#[6,]  999  999   2.4
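
To make the one-liner easier to follow, here is a sketch that splits it into its steps. The intermediate names (cm, expanded, mask, res1, res2) are made up for illustration, and the sweep() variant at the end is an alternative I am adding for comparison, not part of the answer above.

```r
# Same test matrix as in the question.
m1 <- matrix(c(1:3, NA, 2.4, 2.8, 3.9, 0, 1, 3, 0, 2, 1.3, 2, NA, 7, 3.9, 2.4), 6, 3)

# Step 1: one mean per column, ignoring NA.
cm <- colMeans(m1, na.rm = TRUE)

# Step 2: col(m1) holds the column number of every cell, so cm[col(m1)]
# gives the matching column mean for each cell, in column-major order.
expanded <- cm[col(m1)]

# Step 3: element-wise comparison; cells that are NA in m1 stay NA in the mask.
mask <- m1 > expanded

# Step 4: assign through the mask. With a length-one replacement value,
# NA entries in a logical index select nothing, so NA cells are left alone.
res1 <- m1
res1[mask] <- 999

# Alternative sketch (an assumption of mine, not from the answer):
# sweep() applies ">" between each column and its mean.
res2 <- m1
res2[sweep(m1, 2, cm, FUN = ">")] <- 999

identical(res1, res2)
```

Both variants compare whole columns at once, so neither needs an explicit loop over rows or columns.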
10 Comments

@akrun I get the error "NAs are not allowed in subscripted assignments" when I try to use your code. Also, could you show me how to loop over a matrix using apply while referring to indices as well (as an alternative solution)?
@StevenBeaupré Yes, there was a mix-up when I was testing and copying it.
It works fine now. I'd give you credit already, but could you please address my last question in the comments (or the second part of my main post)?
@EDC Looks like you got an answer from DavidArenburg for that.
OK, but isn't that essentially the same as what you did? It is perfectly fine: vectorization applied, correct result. However, my question was also about the broader context of how to replace nested for loops with the apply structure. apply only loops over columns or rows, so I'd assume you need two apply calls to replace a nested for loop? Some words on this issue would be nice.
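
On the nested-loop question, a minimal sketch of one possible reading: a single apply over columns can stand in for the outer loop, and the inner loop over rows disappears because the comparison inside each column is already vectorized, so a second apply is not needed here. The helper name replace_above_mean and the result names are hypothetical. The second variant uses sapply over column indices for cases where the index j itself is needed.

```r
# Same test matrix as in the question.
m1 <- matrix(c(1:3, NA, 2.4, 2.8, 3.9, 0, 1, 3, 0, 2, 1.3, 2, NA, 7, 3.9, 2.4), 6, 3)

# Hypothetical helper: handles one column at a time.
# The vectorized comparison replaces the inner loop over rows.
replace_above_mean <- function(v) {
  v[!is.na(v) & v > mean(v, na.rm = TRUE)] <- 999
  v
}

# apply with MARGIN = 2 replaces the outer loop over columns.
res_apply <- apply(m1, 2, replace_above_mean)

# Index-based variant: iterate over column numbers so j is available inside.
cm <- colMeans(m1, na.rm = TRUE)
res_index <- sapply(seq_len(ncol(m1)), function(j) {
  v <- m1[, j]
  ifelse(!is.na(v) & v > cm[j], 999, v)
})

all.equal(res_apply, res_index)
```

Both calls hide a loop over columns rather than removing it, so they are mainly a readability choice; the colMeans approach in the answer avoids that per-column function call altogether.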