
I want to create 60 data frames with 500 rows each. The code below runs without errors, but it does not give me the data frames. When I call View() on the as.data.frame() result I can see the data, but no data frame appears in my environment. I've been trying various versions of this code for three days:

getDS <- function(x){
  for(i in 1:3){
    for(j in 1:30000){
      ID_i <- data.table(x$ID[j: (j+500)])
    }
  }
  as.data.frame(ID_i)
}

getDS(DATASETNAME)
  • If it is 60 dataframes, the j index should be different. Commented Nov 23, 2018 at 17:58

2 Answers


We can use outer (on a small example)

out1 <- c(outer(1:3, 1:3, Vectorize(function(i, j) list(x$ID[j:(j + 5)]))))
lapply(out1, as.data.table)

--

The issue in the OP's function is that ID_i is overwritten on every iteration of the loop, so only the last value survives and nothing is stored. To keep every result, we can initialize a list and store each piece in it:

getDS <- function(x) {
  # One list slot per value of i; each slot collects the pieces for every j
  ID_i <- vector('list', 3)
  for(i in 1:3) {
    for(j in 1:3) {
      ID_i[[i]][[j]] <- data.table(x$ID[j:(j + 5)])
    }
  }
  ID_i
}

do.call(c, getDS(x))       

data

x <- data.table(ID = 1:50)
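
For the sizes mentioned in the question (60 pieces of 500 rows), a possible sketch is shown below. It assumes the pieces are meant to be consecutive and non-overlapping, which the question does not state explicitly; note that j:(j + 500) in the original loop selects 501 elements and overlaps with the next j. The names big, chunk_size, chunk_id and chunks are made up for illustration.

```r
# Hypothetical stand-in for the asker's data: 30,000 rows with an ID column
big <- data.frame(ID = seq_len(30000))

chunk_size <- 500

# Chunk label per row: 1 for rows 1-500, 2 for rows 501-1000, and so on
chunk_id <- ceiling(seq_len(nrow(big)) / chunk_size)

# split() returns a named list of data frames, one per chunk label
chunks <- split(big, chunk_id)

length(chunks)      # number of pieces
nrow(chunks[[1]])   # rows in the first piece
```

Each piece is then available as chunks[[1]], chunks[[2]], and so on, without creating 60 separate objects.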

4 Comments

Thank you so much, @akrun!!
I'm getting the lists, but not sure how to unpack them into different dataframes.
@Divs. It is better to keep them in a list instead of creating separate objects in the global environment. But if you do want separate objects (not recommended): lst1 <- do.call(c, getDS(x)); names(lst1) <- paste0("data", seq_along(lst1)); list2env(lst1, envir = .GlobalEnv). Now you can check data1, data2 in your global environment. But, as I said, it is easier to work with a list of data.frames.
Thanks @akrun. The reason I want them as dataframes is because I need to write.csv() on them.
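
Writing the files does not require separate objects either; the list can be written out directly. A minimal sketch, assuming a named list like lst1 from the comment above (the two small data frames, out_dir and paths here are placeholders for illustration):

```r
# Hypothetical named list standing in for lst1
lst1 <- list(data1 = data.frame(ID = 1:3),
             data2 = data.frame(ID = 4:6))

# A temporary directory is used here; substitute your own output folder
out_dir <- tempdir()
paths <- file.path(out_dir, paste0(names(lst1), ".csv"))

# Map() pairs each data frame with its file path and writes one CSV per element
invisible(Map(function(df, path) write.csv(df, path, row.names = FALSE),
              lst1, paths))
```

The file names come from the list names, so data1 ends up in data1.csv and so on.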

I'm not sure the description matches the code, so I'm a little unsure what the desired result is. That said, it is usually not helpful to split a data.table because the built-in by-processing makes it unnecessary. If for some reason you do want to split into a list of data.tables you might consider something along the lines of

getDS <- function(x, n=5, size = nrow(x)/n, column = "ID", reps = 3) {
    x <- x[1:(n*size), ..column]
    index <- rep(1:n, each = size) 
    replicate(reps, split(x, index),
              simplify = FALSE)
}

getDS(data.table(ID = 1:20), n = 5)
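
To illustrate the point about by-processing, here is a small sketch that summarises blocks of rows without splitting the table first. The block size of 4, the block column and the summary columns are assumptions chosen for the example, not something taken from the question.

```r
library(data.table)

# Hypothetical example data: 20 rows, processed in blocks of 4 rows
dt <- data.table(ID = 1:20)
size <- 4

# Label each row with its block number (1 for rows 1-4, 2 for rows 5-8, ...)
dt[, block := ceiling(seq_len(.N) / size)]

# Compute a per-block summary directly with `by`, no list of tables needed
res <- dt[, .(n = .N, mean_ID = mean(ID)), by = block]
res
```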

2 Comments

Thanks @Ista. I'm getting an error about column not being found: Error in eval(expr, envir, enclos) : object '..column' not found
@Divs you probably have an old version of data.table. Update to the latest release (currently 1.11.8).
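
If updating is not an option, one possible workaround is to select the column with with = FALSE instead of the .. prefix. This is a sketch on made-up data (the other column and the sub name are only for illustration), not a change to the answer's function:

```r
library(data.table)

# Hypothetical example data with one extra column
x <- data.table(ID = 1:20, other = letters[1:20])
column <- "ID"

# with = FALSE makes data.table treat `column` as a character vector of column names
sub <- x[1:10, column, with = FALSE]
sub
```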
