Merge lists with a dataframe

Question

I have a list of dataframes like out. Each element is a dataframe with the same structure, i.e. the same dimensions and variable names (id/period/pred_dif):

id <- c(1,2,3,4,5,1,2,3,4,5)
period <- c(01,09,12,01,08, 02,08,12,11,12)
pred_dif <- c(0.5,0.1,0.15,0.23,0.75,0.6,0.49,0.81,0.37,0.14)

list_1 <- data.frame(id, period, pred_dif)

pred_dif <- c(0.45,0.18,0.35,0.63,0.25,0.63,0.29,0.11,0.17,0.24)

list_2 <- data.frame(id, period, pred_dif)

pred_dif <- c(0.58,0.13,0.55,0.13,0.76,0.3,0.29,0.81,0.27,0.04)

list_3 <- data.frame(id, period, pred_dif)

pred_dif <- c(0.3,0.61,0.18,0.29,0.85,0.76,0.56,0.91,0.48,0.91)

list_4 <- data.frame(id, period, pred_dif)

out <- list(list_1, list_2, list_3, list_4)

I want to:

1. Merge each dataframe in the list out with a dataframe df of the same structure:

pred_second <- c(0.4,0.71,0.28,0.39,0.95,0.86,0.66,0.81,0.58,0.81)

df <- data.frame(id, period, pred_second)

For a single dataframe, I would proceed as follows:

out <- merge(out, df, by = c("id", "period"), all.x = T)

2. Create a list containing an OLS (lm) regression capturing the effect of the variable "period" on "pred_dif". For a single dataframe, this would be something like:

ols <- summary(lm(formula = pred_dif ~ as.factor(period) - 1, data = out))

3. Create a list or dataframe (preferred) recording the estimates and standard errors of the regressions from point 2 (it is fine if points 2 and 3 happen together).

Any idea how to do this quickly and iteratively for every dataframe in the list?

A few pieces of advice:
- I recommend you do not name your dataframes "list1", "list2", "listxxx". It can get confusing, especially when you also have proper lists.
- You may be better off with rbind() or rbind.data.frame() than merge in this case.
- I would keep these dataframes as a list of dataframes.
– GuedesBF, Commented Jun 11, 2021 at 17:25
Thank you Guedes! How would you convert my data into a list of data frames?
– vog, Commented Jun 11, 2021 at 22:32
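
The stacking idea from the comments can be sketched as follows. This is only an illustration, assuming dplyr is available; the column name source is a hypothetical label, not something from the question. It also shows that out, as built in the question, is already a list of data frames.

```r
library(dplyr)

# Rebuild the example data from the question so the sketch runs on its own
id <- c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5)
period <- c(1, 9, 12, 1, 8, 2, 8, 12, 11, 12)
out <- list(
  data.frame(id, period, pred_dif = c(0.5, 0.1, 0.15, 0.23, 0.75, 0.6, 0.49, 0.81, 0.37, 0.14)),
  data.frame(id, period, pred_dif = c(0.45, 0.18, 0.35, 0.63, 0.25, 0.63, 0.29, 0.11, 0.17, 0.24)),
  data.frame(id, period, pred_dif = c(0.58, 0.13, 0.55, 0.13, 0.76, 0.3, 0.29, 0.81, 0.27, 0.04)),
  data.frame(id, period, pred_dif = c(0.3, 0.61, 0.18, 0.29, 0.85, 0.76, 0.56, 0.91, 0.48, 0.91))
)

# Every element of out is already a data frame
all_frames <- all(vapply(out, is.data.frame, logical(1)))

# Stack the four data frames row-wise; "source" (hypothetical name) records
# which list element each row came from
stacked <- bind_rows(out, .id = "source")
```

The result is one long data frame with 40 rows, which can then be grouped by source instead of being looped over as a list.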

akrun · Accepted Answer · 2021-06-11 22:47:07Z

We could do this with the tidyverse:

1. Loop over the list with map
2. Do the left_join
3. Build the lm model in summarise
4. Convert the output to a tidy dataset
5. Unnest the list of tibbles and store the result in the list (or use map_dfr to return a single dataset, with .id specified as the identifier)

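library(purrr)
library(dplyr)
library(broom)
library(tidyr)
# For each dataframe in out: join df, fit the model, and tidy the coefficients
lmout_lst <- map(out, 
   ~ left_join(.x, df, by = c('id', 'period')) %>%
      # "- 1" drops the intercept, so each period level gets its own estimate
      summarise(new = list(lm(pred_dif ~ as.factor(period) - 1) %>% 
                  broom::tidy(.))) %>%
     # expand the list-column into one row per coefficient
     unnest(new))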
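
The map_dfr variant mentioned in the last step could look roughly like this. It is a sketch rather than the code from the answer: the helper fit_one and the result name lmout_dfr are hypothetical, and the model is fitted directly with lm(..., data = ) instead of inside summarise.

```r
library(purrr)
library(dplyr)
library(broom)

# Rebuild the example data from the question so the sketch runs on its own
id <- c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5)
period <- c(1, 9, 12, 1, 8, 2, 8, 12, 11, 12)
out <- list(
  data.frame(id, period, pred_dif = c(0.5, 0.1, 0.15, 0.23, 0.75, 0.6, 0.49, 0.81, 0.37, 0.14)),
  data.frame(id, period, pred_dif = c(0.45, 0.18, 0.35, 0.63, 0.25, 0.63, 0.29, 0.11, 0.17, 0.24)),
  data.frame(id, period, pred_dif = c(0.58, 0.13, 0.55, 0.13, 0.76, 0.3, 0.29, 0.81, 0.27, 0.04)),
  data.frame(id, period, pred_dif = c(0.3, 0.61, 0.18, 0.29, 0.85, 0.76, 0.56, 0.91, 0.48, 0.91))
)
df <- data.frame(id, period,
                 pred_second = c(0.4, 0.71, 0.28, 0.39, 0.95, 0.86, 0.66, 0.81, 0.58, 0.81))

# Hypothetical helper: join one data frame with df, fit the model, tidy it
fit_one <- function(d) {
  joined <- left_join(d, df, by = c("id", "period"))
  tidy(lm(pred_dif ~ as.factor(period) - 1, data = joined))
}

# map_dfr row-binds the per-element results; .id adds the list position
# as a character column named "categ"
lmout_dfr <- map_dfr(out, fit_one, .id = "categ")
```

In purrr 1.0.0 and later, map_dfr is marked as superseded in favour of map() followed by list_rbind(), so the two-step form may be the safer choice in new code.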

It can be converted to a single dataset with bind_rows as well:

lmout <- bind_rows(lmout_lst, .id = 'categ')

Output:

lmout
# A tibble: 24 x 6
   categ term                estimate std.error statistic  p.value
   <chr> <chr>                  <dbl>     <dbl>     <dbl>    <dbl>
 1 1     as.factor(period)1     0.365    0.223      1.63  0.153   
 2 1     as.factor(period)2     0.6      0.316      1.90  0.106   
 3 1     as.factor(period)8     0.62     0.223      2.78  0.0321  
 4 1     as.factor(period)9     0.1      0.316      0.317 0.762   
 5 1     as.factor(period)11    0.37     0.316      1.17  0.286   
 6 1     as.factor(period)12    0.412    0.141      2.92  0.0267  
 7 2     as.factor(period)1     0.54     0.0789     6.85  0.000478
 8 2     as.factor(period)2     0.63     0.112      5.65  0.00132 
 9 2     as.factor(period)8     0.27     0.0789     3.42  0.0141  
10 2     as.factor(period)9     0.18     0.112      1.61  0.158   
# … with 14 more rows
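
If the join and the estimation need to run as separate passes over the list, the pipeline can be split in two. The sketch below is one possible arrangement, not the answer's own code: fit_period, its response argument, and joined_lst are hypothetical names introduced for illustration.

```r
library(purrr)
library(dplyr)
library(broom)

# Rebuild the example data from the question so the sketch runs on its own
id <- c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5)
period <- c(1, 9, 12, 1, 8, 2, 8, 12, 11, 12)
out <- list(
  data.frame(id, period, pred_dif = c(0.5, 0.1, 0.15, 0.23, 0.75, 0.6, 0.49, 0.81, 0.37, 0.14)),
  data.frame(id, period, pred_dif = c(0.45, 0.18, 0.35, 0.63, 0.25, 0.63, 0.29, 0.11, 0.17, 0.24)),
  data.frame(id, period, pred_dif = c(0.58, 0.13, 0.55, 0.13, 0.76, 0.3, 0.29, 0.81, 0.27, 0.04)),
  data.frame(id, period, pred_dif = c(0.3, 0.61, 0.18, 0.29, 0.85, 0.76, 0.56, 0.91, 0.48, 0.91))
)
df <- data.frame(id, period,
                 pred_second = c(0.4, 0.71, 0.28, 0.39, 0.95, 0.86, 0.66, 0.81, 0.58, 0.81))

# Stage 1: join only. The pair (id = 3, period = 12) occurs twice on both
# sides, so each joined data frame has 12 rows rather than 10; recent dplyr
# versions may warn about a many-to-many relationship here.
joined_lst <- map(out, function(d) left_join(d, df, by = c("id", "period")))

# Stage 2: estimation only. Hypothetical helper; `response` names the
# outcome column, so the same function can be reused for other variables.
fit_period <- function(d, response = "pred_dif") {
  f <- as.formula(paste(response, "~ as.factor(period) - 1"))
  tidy(lm(f, data = d))
}

lmout_lst <- map(joined_lst, fit_period)
lmout <- bind_rows(lmout_lst, .id = "categ")
```

Because the joined data frames are kept in joined_lst, the second stage can be rerun for another outcome, for example map(joined_lst, fit_period, response = "pred_second"), without repeating the join.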

Thank you Akrun! Any idea on how to proceed in tasks 2) and 3)?
@vog Do you want to do the merge for each dataframe in the list and then create the model?
Thank you Akrun! Is it possible to divide your steps into 2 different processes? First process is the creation of the left_join (1. and 2.) and second process is the estimation (as I want to loop for all the different individual variables in the dataset)?
I have posted the question in a clearer manner here: stackoverflow.com/questions/67970525/…
