
I have a list of lists like out. Each list holds a dataframe with the same structure, i.e. the same dimensions and variable names (id/period/pred_dif):

id <- c(1,2,3,4,5,1,2,3,4,5)
period <- c(01,09,12,01,08, 02,08,12,11,12)
pred_dif <- c(0.5,0.1,0.15,0.23,0.75,0.6,0.49,0.81,0.37,0.14)

list_1 <- data.frame(id, period, pred_dif)

pred_dif <- c(0.45,0.18,0.35,0.63,0.25,0.63,0.29,0.11,0.17,0.24)

list_2 <- data.frame(id, period, pred_dif)

pred_dif <- c(0.58,0.13,0.55,0.13,0.76,0.3,0.29,0.81,0.27,0.04)

list_3 <- data.frame(id, period, pred_dif)

pred_dif <- c(0.3,0.61,0.18,0.29,0.85,0.76,0.56,0.91,0.48,0.91)

list_4 <- data.frame(id, period, pred_dif)

out <- list(list_1, list_2, list_3, list_4)

I want to:

  1. Merge each element of the list out with the dataframe df, which has the same structure:
pred_second <- c(0.4,0.71,0.28,0.39,0.95,0.86,0.66,0.81,0.58,0.81)

df <- data.frame(id, period, pred_second)

I would proceed (in a dplyr environment) as follows:

out <- merge(out, df, by = c("id", "period"), all.x = TRUE)
  2. Create a list containing an OLS (lm) regression capturing the effect of the variable "period" on "pred_dif". In a dataframe environment this would be something like:
ols <- summary(lm(formula = pred_dif ~ as.factor(period) - 1, data = out))
  3. Create a list or dataframe (preferred) registering the estimates and standard errors of the regressions from point 2 (it is OK if points 2 and 3 happen together).

Any idea on how to solve this in an iterative and fast way for all lists?

  • A few pieces of advice: (1) I recommend you do not name your dataframes "list1", "list2", "listxxx"; it can get confusing, especially when you also have proper lists. (2) You may be better off with rbind() or rbind.data.frame() than merge in this case. (3) I would keep these dataframes as a list of dataframes. Commented Jun 11, 2021 at 17:25
  • Thank you Guedes! How would you convert my data into a list of data frames? Commented Jun 11, 2021 at 22:32

1 Answer

We could do this with the tidyverse:

  1. Loop over the list with map
  2. Do the left_join
  3. Build the lm model in summarise
  4. Convert the output to a tidy dataset with broom::tidy
  5. unnest the list-column of tibbles and store the result in the list (or use map_dfr instead of map to return a single dataset, with .id specified as the identifier)
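library(purrr)
library(dplyr)
library(broom)
library(tidyr)

# For each data frame in `out`: join `df`, fit the model, tidy the coefficients
lmout_lst <- map(out, 
   ~ left_join(.x, df, by = c('id', 'period')) %>%
      # fit the no-intercept model; pred_dif and period are columns of the joined data
      summarise(new = list(lm(pred_dif ~ as.factor(period) - 1) %>% 
                  broom::tidy(.))) %>%
      # expand the list-column into one row per coefficient
      unnest(new))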
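One thing worth checking before relying on the join: in the example data the key pair (id = 3, period = 12) occurs twice, both in each element of out and in df, so the left_join returns duplicated rows for that key and those observations enter the regression more than once. A minimal sketch of the check, using only the first data frame from the question (the names dup_keys and joined are introduced here for illustration); recent dplyr versions may also warn about a many-to-many relationship at the join:

```r
library(dplyr)

# Data copied from the question
id <- c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5)
period <- c(1, 9, 12, 1, 8, 2, 8, 12, 11, 12)
list_1 <- data.frame(id, period,
                     pred_dif = c(0.5, 0.1, 0.15, 0.23, 0.75, 0.6, 0.49, 0.81, 0.37, 0.14))
df <- data.frame(id, period,
                 pred_second = c(0.4, 0.71, 0.28, 0.39, 0.95, 0.86, 0.66, 0.81, 0.58, 0.81))

# Keys that occur more than once in df
dup_keys <- df %>%
  count(id, period) %>%
  filter(n > 1)
dup_keys   # one row: id 3, period 12, n 2

# The join therefore returns more rows than list_1 has
joined <- left_join(list_1, df, by = c("id", "period"))
nrow(joined)   # 12 rather than 10
```

If the duplicates are not intended, deduplicating or aggregating df by id and period before the join would be one way to handle it.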

The list can also be converted to a single dataset with bind_rows:

lmout <- bind_rows(lmout_lst, .id = 'categ')
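The map_dfr route mentioned in step 5 could look roughly like the following. This is a sketch rather than the answer's exact code: the helper fit_one and the result name lmout_dfr are made up for illustration, and the model is fitted with an explicit data argument instead of inside summarise. Note that newer purrr releases mark map_dfr as superseded in favour of map() followed by list_rbind(), although it still works.

```r
library(purrr)
library(dplyr)
library(broom)

# Data copied from the question
id <- c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5)
period <- c(1, 9, 12, 1, 8, 2, 8, 12, 11, 12)
out <- list(
  data.frame(id, period, pred_dif = c(0.5, 0.1, 0.15, 0.23, 0.75, 0.6, 0.49, 0.81, 0.37, 0.14)),
  data.frame(id, period, pred_dif = c(0.45, 0.18, 0.35, 0.63, 0.25, 0.63, 0.29, 0.11, 0.17, 0.24)),
  data.frame(id, period, pred_dif = c(0.58, 0.13, 0.55, 0.13, 0.76, 0.3, 0.29, 0.81, 0.27, 0.04)),
  data.frame(id, period, pred_dif = c(0.3, 0.61, 0.18, 0.29, 0.85, 0.76, 0.56, 0.91, 0.48, 0.91))
)
df <- data.frame(id, period,
                 pred_second = c(0.4, 0.71, 0.28, 0.39, 0.95, 0.86, 0.66, 0.81, 0.58, 0.81))

# Hypothetical helper: join one data frame with df, fit the model, tidy the result
fit_one <- function(d) {
  joined <- left_join(d, df, by = c("id", "period"))
  tidy(lm(pred_dif ~ as.factor(period) - 1, data = joined))
}

# Row-bind the per-element results; `categ` holds the position in `out`
lmout_dfr <- map_dfr(out, fit_one, .id = "categ")
```

This gives the same single table as bind_rows(lmout_lst, .id = 'categ') without keeping the intermediate list.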

-output

lmout
# A tibble: 24 x 6
   categ term                estimate std.error statistic  p.value
   <chr> <chr>                  <dbl>     <dbl>     <dbl>    <dbl>
 1 1     as.factor(period)1     0.365    0.223      1.63  0.153   
 2 1     as.factor(period)2     0.6      0.316      1.90  0.106   
 3 1     as.factor(period)8     0.62     0.223      2.78  0.0321  
 4 1     as.factor(period)9     0.1      0.316      0.317 0.762   
 5 1     as.factor(period)11    0.37     0.316      1.17  0.286   
 6 1     as.factor(period)12    0.412    0.141      2.92  0.0267  
 7 2     as.factor(period)1     0.54     0.0789     6.85  0.000478
 8 2     as.factor(period)2     0.63     0.112      5.65  0.00132 
 9 2     as.factor(period)8     0.27     0.0789     3.42  0.0141  
10 2     as.factor(period)9     0.18     0.112      1.61  0.158   
# … with 14 more rows
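For readers who prefer to stay close to the merge() and summary(lm()) calls from the question, a base R sketch of the same idea might look like this. It keeps only the estimates and standard errors asked for in point 3. The names fit_one and res, and the column name std_error, are chosen here for illustration and are not part of the original answer.

```r
# Data copied from the question
id <- c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5)
period <- c(1, 9, 12, 1, 8, 2, 8, 12, 11, 12)
out <- list(
  data.frame(id, period, pred_dif = c(0.5, 0.1, 0.15, 0.23, 0.75, 0.6, 0.49, 0.81, 0.37, 0.14)),
  data.frame(id, period, pred_dif = c(0.45, 0.18, 0.35, 0.63, 0.25, 0.63, 0.29, 0.11, 0.17, 0.24)),
  data.frame(id, period, pred_dif = c(0.58, 0.13, 0.55, 0.13, 0.76, 0.3, 0.29, 0.81, 0.27, 0.04)),
  data.frame(id, period, pred_dif = c(0.3, 0.61, 0.18, 0.29, 0.85, 0.76, 0.56, 0.91, 0.48, 0.91))
)
df <- data.frame(id, period,
                 pred_second = c(0.4, 0.71, 0.28, 0.39, 0.95, 0.86, 0.66, 0.81, 0.58, 0.81))

# Hypothetical helper: merge, fit, and keep estimate + standard error per term
fit_one <- function(d) {
  merged <- merge(d, df, by = c("id", "period"), all.x = TRUE)
  fit <- lm(pred_dif ~ as.factor(period) - 1, data = merged)
  cf <- coef(summary(fit))   # coefficient matrix from summary.lm
  data.frame(term = rownames(cf),
             estimate = cf[, "Estimate"],
             std_error = cf[, "Std. Error"],
             row.names = NULL)
}

# One block of rows per element of `out`, tagged with its position
res <- do.call(rbind, lapply(seq_along(out), function(i) {
  cbind(categ = i, fit_one(out[[i]]))
}))
head(res)
```

The result is a plain data.frame with one row per data frame and period level.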

6 Comments

Thank you Akrun! Any idea on how to proceed in tasks 2) and 3)?
@vog Do you want to do the merge for each data frame in the list and then create the model?
Exactly as you describe
Thank you Akrun! Is it possible to divide your steps into 2 different processes? First process is the creation of the left_join (1. and 2.) and second process is the estimation (as I want to loop for all the different individual variables in the dataset)?
I have posted the question more clearly here: stackoverflow.com/questions/67970525/…
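The two-process split asked about in the comments could be sketched as follows. This is only an illustration under assumptions, not the answer from the linked question: the vector responses, the helper fit_response, and the choice of pred_dif and pred_second as the variables to loop over are all introduced here as examples.

```r
library(purrr)
library(dplyr)
library(broom)

# Data copied from the question
id <- c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5)
period <- c(1, 9, 12, 1, 8, 2, 8, 12, 11, 12)
out <- list(
  data.frame(id, period, pred_dif = c(0.5, 0.1, 0.15, 0.23, 0.75, 0.6, 0.49, 0.81, 0.37, 0.14)),
  data.frame(id, period, pred_dif = c(0.45, 0.18, 0.35, 0.63, 0.25, 0.63, 0.29, 0.11, 0.17, 0.24)),
  data.frame(id, period, pred_dif = c(0.58, 0.13, 0.55, 0.13, 0.76, 0.3, 0.29, 0.81, 0.27, 0.04)),
  data.frame(id, period, pred_dif = c(0.3, 0.61, 0.18, 0.29, 0.85, 0.76, 0.56, 0.91, 0.48, 0.91))
)
df <- data.frame(id, period,
                 pred_second = c(0.4, 0.71, 0.28, 0.39, 0.95, 0.86, 0.66, 0.81, 0.58, 0.81))

# Process 1: only the join, kept as a list of data frames
joined_lst <- map(out, function(d) left_join(d, df, by = c("id", "period")))

# Process 2: estimation, looped over the response variables of interest
responses <- c("pred_dif", "pred_second")   # assumption: example variables

# Hypothetical helper: fit one response on the period dummies, no intercept
fit_response <- function(d, response) {
  f <- as.formula(paste(response, "~ as.factor(period) - 1"))
  tidy(lm(f, data = d))
}

results <- joined_lst %>%
  map(function(d) {
    per_response <- map(set_names(responses), function(v) fit_response(d, v))
    bind_rows(per_response, .id = "response")
  }) %>%
  bind_rows(.id = "categ")
```

Keeping joined_lst as its own object means the join runs once, and the estimation step can be rerun with a different set of variables.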