Subsetting multiple dataframes within list in R based on strings in another dataframe

Question

I am trying to subset multiple dataframes that are contained in a list based on strings that are contained in another dataframe.

list.df <- list(
 df.1 = data.frame(LM = c(1:10), LS = c(1:10), PL = c(1:10)), 
 df.2 = data.frame(XY = c(1:10), FE = c(4:13), OI = c(1:10)), 
 df.3 = data.frame(IL = c(1:10), KU = c(9:18), TS = c(1:10)))

df.4 <- data.frame(df.1 = c("LM", "PL", NA), df.2 = c("FE", NA, NA), 
 df.3 = c("IL", "KU", "TS"))

I want all my dataframes to look like this in the end:

df.1_sub <- subset(list.df[["df.1"]], select = 
   colnames(list.df[["df.1"]]) %in% df.4$df.1)

I will have to do this for around 50 datasets and was wondering whether there was a way of writing a loop to do this for all the datasets at once.

I have tried using lapply and for loops, but so far without success. I am new to using lists in R and would appreciate any help! This is my first time posting on Stack Overflow, so please let me know if my post isn't appropriate.

Just to clarify, if you created df.2_sub it would just be the FE column, correct? And df.3_sub would be a 10x3 dataframe consisting of columns IL, KU, and TS?
– sumshyftw, Commented Jun 5, 2019 at 23:20

Ronak Shah · Accepted Answer · 2019-06-05 23:33:02Z

One way is to use Map: drop the NA values from each column of df.4, then select the remaining names as columns from the corresponding data frame in list.df.

Map(function(x, y) x[as.character(na.omit(y))], list.df, df.4)

#$df.1
#   LM PL
#1   1  1
#2   2  2
#3   3  3
#4   4  4
#5   5  5
#6   6  6
#7   7  7
#8   8  8
#9   9  9
#10 10 10

#$df.2
#   FE
#1   4
#2   5
#3   6
#4   7
#5   8
#6   9
#7  10
#8  11
#9  12
#10 13

#$df.3
#   IL KU TS
#1   1  9  1
#2   2 10  2
#3   3 11  3
#.....

The same can be achieved with purrr::map2:

purrr::map2(list.df, df.4, ~.x[na.omit(as.character(.y))])
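
Since the question mentions trying a for loop, here is a minimal sketch of a loop that mirrors the Map call above. It pairs list.df and df.4 by position, and assumes df.4 has one column per data frame in the same order. The name list.sub is illustrative and not from the original post.

```r
list.df <- list(
  df.1 = data.frame(LM = 1:10, LS = 1:10, PL = 1:10),
  df.2 = data.frame(XY = 1:10, FE = 4:13, OI = 1:10),
  df.3 = data.frame(IL = 1:10, KU = 9:18, TS = 1:10))

df.4 <- data.frame(df.1 = c("LM", "PL", NA), df.2 = c("FE", NA, NA),
                   df.3 = c("IL", "KU", "TS"), stringsAsFactors = FALSE)

# Pre-allocate the result list and carry over the names of list.df
list.sub <- vector("list", length(list.df))
names(list.sub) <- names(list.df)

for (i in seq_along(list.df)) {
  # Column names to keep for the i-th data frame, with NA entries dropped
  keep <- as.character(na.omit(df.4[[i]]))
  list.sub[[i]] <- list.df[[i]][keep]
}
```

The loop and the Map call should give the same result; Map is simply the more compact way to write it.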

edited Jun 5, 2019 at 23:33

answered Jun 5, 2019 at 23:29

Ronak Shah


6 Comments

thelatemail Over a year ago

Too fast for me. Map instead of mapply probably makes more sense since you aren't simplifying the result though.

thelatemail Over a year ago

it's a pity df.4 has factor columns or you could collapse significantly - Map(`[`, list.df, lapply(df.4, na.omit)) - which unfortunately gives the wrong answer currently.
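
A minimal sketch of the factor issue described in this comment. It assumes the behaviour of R versions before 4.0, where data.frame() converted strings to factors by default, and sets stringsAsFactors explicitly so both cases can be reproduced. The names df.4.fct and df.4.chr are illustrative and not from the original post.

```r
list.df <- list(
  df.1 = data.frame(LM = 1:10, LS = 1:10, PL = 1:10),
  df.2 = data.frame(XY = 1:10, FE = 4:13, OI = 1:10),
  df.3 = data.frame(IL = 1:10, KU = 9:18, TS = 1:10))

# Factor columns: `[` treats a factor index as its integer codes,
# so columns are picked by position rather than by name
df.4.fct <- data.frame(df.1 = c("LM", "PL", NA), df.2 = c("FE", NA, NA),
                       df.3 = c("IL", "KU", "TS"), stringsAsFactors = TRUE)
wrong <- Map(`[`, list.df, lapply(df.4.fct, na.omit))
names(wrong$df.1)  # "LM" "LS"

# Character columns: the same short form selects by name
df.4.chr <- data.frame(df.1 = c("LM", "PL", NA), df.2 = c("FE", NA, NA),
                       df.3 = c("IL", "KU", "TS"), stringsAsFactors = FALSE)
right <- Map(`[`, list.df, lapply(df.4.chr, na.omit))
names(right$df.1)  # "LM" "PL"
```

Wrapping the index in as.character(), as the answer does, avoids depending on how df.4 was created.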

Ricarda Over a year ago

Thank you so much for your replies! I have tried the above and it works fine on the example, but when I try it on my actual data I get this error: Error: Can't find columns AD, AB, AW, AC, AL, ... (and 32 more) in `.data`. I have checked manually, and these columns are definitely in one of the dataframes in the list. Any ideas?

Ronak Shah Over a year ago

@Ricarda This does not search across all the dataframes in the list. It subsets list.df[[1]] using the first column of df.4, list.df[[2]] using the second column of df.4, and so on. Are you trying to subset from the entire list.df?

Ricarda Over a year ago

@Ronak, Thanks! I just realised that the order of the columns in df.4 isn't the same as the order of the dfs in list.df. Basically, there is one column in df.4 that corresponds to one of the dfs in the list. That column and the dataframe have the same name.
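
Since each column of df.4 shares its name with the data frame it describes, one option is to pair them by name instead of by position. The following is a sketch under that assumption. The reordered df.4 and the helper name subset_by_name are hypothetical, and intersect() is used here so that requested names missing from a data frame are skipped rather than raising an error.

```r
list.df <- list(
  df.1 = data.frame(LM = 1:10, LS = 1:10, PL = 1:10),
  df.2 = data.frame(XY = 1:10, FE = 4:13, OI = 1:10),
  df.3 = data.frame(IL = 1:10, KU = 9:18, TS = 1:10))

# df.4 with its columns in a different order than list.df
df.4 <- data.frame(df.3 = c("IL", "KU", "TS"), df.1 = c("LM", "PL", NA),
                   df.2 = c("FE", NA, NA), stringsAsFactors = FALSE)

# Hypothetical helper: for each data frame, look up the df.4 column
# that carries the same name and keep only the columns listed there
subset_by_name <- function(dfs, keep) {
  Map(function(x, nm) {
    wanted <- as.character(na.omit(keep[[nm]]))
    x[intersect(wanted, names(x))]  # names not present in x are skipped
  }, dfs, names(dfs))
}

res <- subset_by_name(list.df, df.4)
```

When every data frame has a matching column in df.4, reordering with df.4[names(list.df)] before calling Map is a shorter alternative.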

akrun · Accepted Answer · 2019-06-06 07:44:52Z

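We can use complete.cases with Map to drop the NA entries from each column of df.4 and select the remaining names

# complete.cases(y) is a logical mask over the entries of y, so it is
# applied to y first; the surviving names then index the columns of x
Map(function(x, y) x[as.character(y[complete.cases(y)])], list.df, df.4)
#$df.1
#   LM PL
#1   1  1
#2   2  2
#3   3  3
#4   4  4
#5   5  5
#6   6  6
#7   7  7
#8   8  8
#9   9  9
#10 10 10

#$df.2
#   FE
#1   4
#2   5
#3   6
#4   7
#5   8
#6   9
#7  10
#8  11
#9  12
#10 13

#$df.3
#   IL KU TS
#1   1  9  1
#2   2 10  2
#3   3 11  3
#4   4 12  4
#5   5 13  5
#6   6 14  6
#7   7 15  7
#8   8 16  8
#9   9 17  9
#10 10 18 10

Or using pmap

library(purrr)
pmap(list(list.df, df.4), ~ .x[as.character(.y[complete.cases(.y)])])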
