I need to compare every element of a list with every other element: dog vs cat, dog vs mouse, and cat vs mouse.

animals <- list(
  dog = data.frame(
    col_1 = sample(c(100:200), size = 10, replace = TRUE),
    col_2 = sample(c(100:200), size = 10, replace = TRUE),
    col_3 = sample(c(100:200), size = 10, replace = TRUE),
    col_4 = sample(c(100:200), size = 10, replace = TRUE)
  ),
  cat = data.frame(
    col_1 = sample(c(100:200), size = 10, replace = TRUE),
    col_2 = sample(c(100:200), size = 10, replace = TRUE),
    col_3 = sample(c(100:200), size = 10, replace = TRUE),
    col_4 = sample(c(100:200), size = 10, replace = TRUE)
  ),
  mouse = data.frame(
    col_1 = sample(c(100:200), size = 10, replace = TRUE),
    col_2 = sample(c(100:200), size = 10, replace = TRUE),
    col_3 = sample(c(100:200), size = 10, replace = TRUE),
    col_4 = sample(c(100:200), size = 10, replace = TRUE)
  )
)

I want to compare every combination of two dataframes from the above. The type of comparison isn't important - it's sufficient to say this can be done via a generic_function() like so:

generic_function(animals$dog[, c("col_1", "col_2")],
                 animals$cat[, c("col_1", "col_2")])

Since there is a lot of data to compare, I would like to use lapply(), but I don't know how to generate the pairwise combinations of the list elements. Can you help me?

  • I think we need some more information. Do you want to compare every dataframe with every other dataframe, whilst also taking every possible combination of two columns from each dataframe? Commented Mar 14, 2024 at 15:40
  • Additionally {MsCoreUtils} doesn't seem to be on CRAN - please provide a link to the package. If the question is the same if you replace ndotproduct() with any function which accepts two dataframes, I'd suggest editing it to remove this detail. Commented Mar 14, 2024 at 15:41
  • @wurli I don't need every column combination, only comparisons between specific columns, as in the example. And yes, this should work with any generic function that accepts two dataframes. Commented Mar 14, 2024 at 15:45
  • combn(names(animals), 2L) and then iterate over the columns? Commented Mar 14, 2024 at 15:51

3 Answers

You can generate combinations with combn() and then iterate over combinations using lapply():

animals <- list(
  dog = data.frame(
    col_1 = sample(c(100:200), size = 10, replace = TRUE),
    col_2 = sample(c(100:200), size = 10, replace = TRUE),
    col_3 = sample(c(100:200), size = 10, replace = TRUE),
    col_4 = sample(c(100:200), size = 10, replace = TRUE)
  ),
  cat = data.frame(
    col_1 = sample(c(100:200), size = 10, replace = TRUE),
    col_2 = sample(c(100:200), size = 10, replace = TRUE),
    col_3 = sample(c(100:200), size = 10, replace = TRUE),
    col_4 = sample(c(100:200), size = 10, replace = TRUE)
  ),
  mouse = data.frame(
    col_1 = sample(c(100:200), size = 10, replace = TRUE),
    col_2 = sample(c(100:200), size = 10, replace = TRUE),
    col_3 = sample(c(100:200), size = 10, replace = TRUE),
    col_4 = sample(c(100:200), size = 10, replace = TRUE)
  )
)


combinations <- asplit(combn(names(animals), 2), 2)

print(combinations)
#> [[1]]
#> [1] "dog" "cat"
#> 
#> [[2]]
#> [1] "dog"   "mouse"
#> 
#> [[3]]
#> [1] "cat"   "mouse"

lapply(combinations, function(x) {
  generic_function(
    animals[[x[1]]][, c("col_1", "col_2")],
    animals[[x[2]]][, c("col_1", "col_2")]
  )
})

Created on 2024-03-14 with reprex v2.1.0
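
As a possible extension of the approach above, the pairs can be named so that each result is labelled with the comparison it came from. This is only a sketch: the body of generic_function() below is a hypothetical stand-in (mean absolute difference), the make_df() helper is introduced purely for illustration, and the "_vs_" label format is an arbitrary choice.

```r
# Hypothetical stand-in for generic_function(): mean absolute difference
# between two data frames of identical shape (an assumption for illustration).
generic_function <- function(a, b) mean(abs(as.matrix(a) - as.matrix(b)))

set.seed(1)  # make the sampled data reproducible

# Illustrative helper that builds one small data frame per animal.
make_df <- function() {
  data.frame(
    col_1 = sample(100:200, size = 10, replace = TRUE),
    col_2 = sample(100:200, size = 10, replace = TRUE)
  )
}
animals <- list(dog = make_df(), cat = make_df(), mouse = make_df())

cols <- c("col_1", "col_2")

# simplify = FALSE returns a list with one character vector per pair.
pairs <- combn(names(animals), 2, simplify = FALSE)

# Label each pair, e.g. "dog_vs_cat", so the results are self-describing.
names(pairs) <- vapply(pairs, paste, character(1), collapse = "_vs_")

# lapply() carries the names of `pairs` over to the results.
results <- lapply(pairs, function(p) {
  generic_function(animals[[p[1]]][, cols], animals[[p[2]]][, cols])
})
```

Because lapply() preserves names, a single comparison can then be looked up by its label, for example results[["dog_vs_cat"]].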

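A base R alternative is to loop over index pairs (i, j) with i < j using nested lapply() calls:

n <- length(animals)
cols <- c("col_1", "col_2")
# seq_len(n - 1) is empty for a one-element list, whereas 1:(n - 1) would yield c(1, 0)
lapply(seq_len(n - 1), \(i) {
  lapply((i + 1):n, \(j) generic_function(animals[[i]][cols], animals[[j]][cols]))
}) |>
  unlist(recursive = FALSE)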
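
To see why the nested loop covers each pair exactly once, the index pairs it visits can be compared with those produced by combn(). This is a small check rather than part of the answer; n = 3 is assumed to match the three-element animals list.

```r
n <- 3  # number of list elements, as in the animals example

# Collect the (i, j) index pairs visited by the nested lapply(), with i < j.
idx <- lapply(seq_len(n - 1), \(i) lapply((i + 1):n, \(j) c(i, j))) |>
  unlist(recursive = FALSE)

# The same pairs as generated by combn() on the indices 1..n.
from_combn <- combn(n, 2, simplify = FALSE)

# Compare the two lists element by element.
same_pairs <- isTRUE(all.equal(idx, from_combn))
```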

You can use combn() directly, passing the comparison through its FUN argument:

combn(
  lapply(animals, `[`, c("col_1", "col_2")),
  2,
  # unname() so the pair is matched by position, not by the names "dog", "cat", ...
  FUN = \(x) do.call(generic_function, unname(x)),
  simplify = FALSE
)
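
One detail worth checking with this approach: combn() keeps the list names on each pair, and do.call() turns list names into argument names. The sketch below illustrates the difference with a hypothetical two-argument function f(), which stands in for generic_function() and is not from the question.

```r
# Hypothetical two-argument function standing in for generic_function().
f <- function(a, b) sum(a) - sum(b)

# A named pair, shaped like what combn() hands to FUN for a named list.
x <- list(dog = 1:3, cat = 4:6)

# The list names become argument names (dog =, cat =), which f() does not
# accept, so the call fails; the error message is captured as a string.
named_call <- tryCatch(do.call(f, x), error = function(e) conditionMessage(e))

# Dropping the names makes the arguments match by position instead.
positional_call <- do.call(f, unname(x))
```

If generic_function() accepts arbitrary named arguments (for example through `...`), the names may be harmless, but matching by position avoids depending on that.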
