I am trying to write a for loop that iterates over a list, checks whether each element is a data frame and, if it is, calculates the mean and adds a new column holding that mean value. Here is an example:

df1 <- data.frame(
  Number = c(45,62,27,34,37,55,40))
df2 <- data.frame(
  Number = c(15,20,32,21,17,18,13))
df3 <- data.frame(
  Number = c(12,32,22,14,16,21,30))

L <- list(df1,df2,df3)

for(i in L){if(is.data.frame(i)){
  i$Average <- mean(i)
}
}

An example of the result I am after, for df1, would be:

 Number  Average
1     45 42.85714
2     62 42.85714
3     27 42.85714
4     34 42.85714
5     37 42.85714
6     55 42.85714
7     40 42.85714

Thanks!

3 Answers

2

If we need to update the original data.frame objects with the new value, use assign:

nm1 <- paste0("df", 1:3)
for(i in seq_along(L)) {
    assign(nm1[i], `[<-`(L[[i]], "Average", value = mean(L[[i]]$Number)))
 }   

df1
#  Number  Average
#1     45 42.85714
#2     62 42.85714
#3     27 42.85714
#4     34 42.85714
#5     37 42.85714
#6     55 42.85714
#7     40 42.85714
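
As an alternative to calling assign in a loop, the same write-back can be sketched with a named list. This is not the approach used above, just a variation on it: mget() collects the objects into a list whose names are "df1", "df2", "df3", and list2env() copies the updated elements back to variables of those names. The name L_named is only illustrative.

```r
df1 <- data.frame(Number = c(45, 62, 27, 34, 37, 55, 40))
df2 <- data.frame(Number = c(15, 20, 32, 21, 17, 18, 13))
df3 <- data.frame(Number = c(12, 32, 22, 14, 16, 21, 30))

# Collect the data frames into a list named "df1", "df2", "df3"
L_named <- mget(paste0("df", 1:3), envir = environment())

# Add the Average column to every element, keeping the names
L_named <- lapply(L_named, function(d) {
  d$Average <- mean(d$Number)
  d
})

# Overwrite df1, df2, df3 in the current environment with the updated copies
invisible(list2env(L_named, envir = environment()))

names(df1)
```

After this runs, df1, df2 and df3 each carry the extra Average column, and L_named still holds the same data under the same names.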

Regarding why the OP's loop didn't work,

for(i in L) print(i)

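returns the values of the list elements, not the names of the objects. The loop variable i is only a copy of each element, so an assignment such as i$Average <- changes that copy and leaves both L and df1 untouched. The list elements don't have names either. Also, mean works on a vector; it cannot be applied directly to a data.frame: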

mean(L[[1]])
#[1] NA

#Warning message:
#In mean.default(L[[1]]) : argument is not numeric or logical: returning NA

mean(L[[1]]$Number)
#[1] 42.85714

In the for loop, this means every iteration returns NA:

for(i in L) mean(i)
#  Warning messages:
#1: In mean.default(i) : argument is not numeric or logical: returning NA
#2: In mean.default(i) : argument is not numeric or logical: returning NA
#3: In mean.default(i) : argument is not numeric or logical: returning NA

Once we extract the column 'Number', mean works:

for(i in L) print(mean(i$Number))
#[1] 42.85714
#[1] 19.42857
#[1] 21
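
To make the copy behaviour of the loop variable concrete, here is a small sketch with a shortened two-element list (the values are taken from the question, truncated for brevity). It fixes the mean call but still assigns to i, and then checks what actually changed.

```r
L <- list(
  data.frame(Number = c(45, 62, 27)),
  data.frame(Number = c(15, 20, 32))
)

for (i in L) {
  # This modifies only the loop variable, which is a copy of the list element
  i$Average <- mean(i$Number)
}

# The list itself is unchanged: each element still has a single column
names(L[[1]])
# [1] "Number"

# Only the leftover loop variable (a copy of the last element) has the new column
names(i)
# [1] "Number"  "Average"
```

So the computation succeeds, but the result is thrown away at the end of each iteration unless it is written back into L or into the original objects.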

However, it is easier to keep the datasets in the list and update them there. Use lapply to loop over the list and create a column 'Average' holding the mean of 'Number':

lapply(L, transform, Average = mean(Number))
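
Note that lapply returns a new list, so the result has to be stored to be of any use. The sketch below also keeps the is.data.frame() check from the question; the character string in the list is a made-up element added only to show what happens to something that is not a data frame, and L2 is just an illustrative name.

```r
df1 <- data.frame(Number = c(45, 62, 27, 34, 37, 55, 40))
df2 <- data.frame(Number = c(15, 20, 32, 21, 17, 18, 13))

# Third element is a hypothetical non-data-frame, for illustration only
L <- list(df1, df2, "not a data frame")

L2 <- lapply(L, function(x) {
  if (is.data.frame(x)) {
    transform(x, Average = mean(Number))
  } else {
    x  # pass anything else through unchanged
  }
})

# Which elements of the result are data frames
sapply(L2, is.data.frame)
# [1]  TRUE  TRUE FALSE
```

Assigning back to L instead of L2 would update the list in place, if the original is no longer needed.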

Or with tidyverse

library(tidyverse)
L %>%
   map(~ .x %>%
            mutate(Average = mean(Number)))

1

i is only a temporary object used to control your for loop. To make changes that persist in L after the loop, index into L by number, like this:

df1 <- data.frame(Number = c(45,62,27,34,37,55,40))
df2 <- data.frame(Number = c(15,20,32,21,17,18,13))
df3 <- data.frame(Number = c(12,32,22,14,16,21,30))

L <- list(df1,df2,df3)

for (i in seq_along(L)) {  # seq_along() is also safe when L is empty
  if (is.data.frame(L[[i]])) {
    ## Requires explicitly extracting the values in
    ## L[[i]] by name, so it could be problematic if you actually
    ## have many columns in your data frames.
    L[[i]]$Average <- mean(L[[i]]$Number)
  }
}
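
If hard-coding the column name is a concern, one possible variation is to move the work into a small helper that takes the column name as an argument. The function add_column_mean and its parameters col and new_col are hypothetical names used for this sketch, not part of any package.

```r
# Hypothetical helper: add a column holding the mean of one named column
add_column_mean <- function(df, col, new_col = "Average") {
  stopifnot(is.data.frame(df), col %in% names(df))
  df[[new_col]] <- mean(df[[col]])
  df
}

L <- list(
  data.frame(Number = c(45, 62, 27, 34, 37, 55, 40)),
  data.frame(Number = c(15, 20, 32, 21, 17, 18, 13))
)

for (i in seq_along(L)) {
  if (is.data.frame(L[[i]])) {
    # Write the result back into the list by position
    L[[i]] <- add_column_mean(L[[i]], "Number")
  }
}

names(L[[1]])
# [1] "Number"  "Average"
```

The loop still indexes L by position, as above; only the column name has been lifted out so it can be changed in one place.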

1

You may use purrr::map to do this:

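library(purrr)
library(dplyr)  # mutate() comes from dplyr, not purrr

L %>%
  map(~ if (is.data.frame(.x)) {
    mutate(.x, Average = mean(Number))
  } else {
    .x  # leave non-data-frame elements unchanged instead of returning NULL
  })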

# [[1]]
#   Number  Average
# 1     45 42.85714
# 2     62 42.85714
# 3     27 42.85714
# 4     34 42.85714
# 5     37 42.85714
# 6     55 42.85714
# 7     40 42.85714
# 
# [[2]]
#   Number  Average
# 1     15 19.42857
# 2     20 19.42857
# 3     32 19.42857
# 4     21 19.42857
# 5     17 19.42857
# 6     18 19.42857
# 7     13 19.42857

# [[3]]
#   Number Average
# 1     12      21
# 2     32      21
# 3     22      21
# 4     14      21
# 5     16      21
# 6     21      21
# 7     30      21

1 Comment

Yep, it's just missing the is.data.frame() check.
