
I want to:

  1. merge each data frame in the list out with the data frame df
  2. estimate an lm() model on the merged data

id <- c(1,2,3,4,5,1,2,3,4,5)
quarter <- c("1","2","1","1","2", "3","1","1","3","3")
month <- c(3,4,2,1,5,7,3,1,8,9)
pred_dif <- c(0.5,0.1,0.15,0.23,0.75,0.6,0.49,0.81,0.37,0.14)

list_1 <- data.frame(id, pred_dif, month)

pred_dif <- c(0.45,0.18,0.35,0.63,0.25,0.63,0.29,0.11,0.17,0.24)

list_2 <- data.frame(id, pred_dif, month)

pred_dif <- c(0.58,0.13,0.55,0.13,0.76,0.3,0.29,0.81,0.27,0.04)

list_3 <- data.frame(id, pred_dif, month)

pred_dif <- c(0.3,0.61,0.18,0.29,0.85,0.76,0.56,0.91,0.48,0.91)

list_4 <- data.frame(id, pred_dif, month)

out <- list(list_1, list_2, list_3, list_4)


pred_second <- c(0.4,0.71,0.28,0.39,0.95,0.86,0.66,0.81,0.58,0.81)
df <- data.frame(id, quarter, pred_second, month)



library(purrr)
library(dplyr)
library(broom)
library(tidyr)
lmout_lst <- map(out, 
                 ~ left_join(.x, df, by = c('id', 'month')) %>%
                   group_by(quarter) %>%
                   summarise(new = list(lm(pred_dif ~ as.factor(month) - 1) %>% 
                                          broom::tidy(.))) %>%
                   unnest(new))

The problem happens when building lmout_lst, in particular at the group_by() step.

Any idea why this is happening and possible solutions?

  • Thank you @Ronak. The code doesn't really fail if you now try out[[1]] %>% filter(quarter == '1') %>% {lm(pred_dif ~ as.factor(month) - 1, data = .)} Commented Jun 15, 2021 at 0:19

2 Answers


Perhaps you can try this:

library(tidyverse)

map(out, 
    ~ left_join(.x, df, by = c('id', 'month')) %>%
      group_by(quarter) %>%
      summarise(new = list({
            tryCatch(lm(pred_dif ~ as.factor(month) - 1) %>% broom::tidy(.), 
                     error = function(e) tibble(estimate = NA))
        })) %>%
      unnest(new)
)

If you want to combine all the results together use map_df instead of map.
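
To illustrate the map_df suggestion, here is a minimal sketch that stacks the per-element results into a single tibble. It uses map_dfr (the row-binding variant behind map_df) with .id so each row records which list element it came from. The objects toy_df, toy_out and fit_by_quarter are made up for this illustration and are not part of the question's data.

```r
library(purrr)
library(dplyr)
library(tidyr)
library(broom)

# Hypothetical toy data: one lookup table and a list with two elements
id    <- c(1, 2, 3, 4, 1, 2, 3, 4)
month <- c(1, 2, 4, 5, 2, 1, 5, 4)
toy_df <- data.frame(id, month,
                     quarter = c("1", "1", "2", "2", "1", "1", "2", "2"))
toy_out <- list(
  data.frame(id, month, pred_dif = c(0.50, 0.10, 0.15, 0.23, 0.75, 0.60, 0.49, 0.81)),
  data.frame(id, month, pred_dif = c(0.45, 0.18, 0.35, 0.63, 0.25, 0.63, 0.29, 0.11))
)

# Same pipeline as in the answer, wrapped in a helper (hypothetical name)
fit_by_quarter <- function(.x) {
  left_join(.x, toy_df, by = c("id", "month")) %>%
    group_by(quarter) %>%
    summarise(new = list(tryCatch(
      broom::tidy(lm(pred_dif ~ as.factor(month) - 1)),
      error = function(e) tibble(estimate = NA_real_)  # placeholder row on failure
    ))) %>%
    unnest(new)
}

# map_dfr row-binds the results; .id adds the list position as a column
combined <- map_dfr(toy_out, fit_by_quarter, .id = "element")
```

Because the list is unnamed, the element column holds the positions "1" and "2"; naming the list elements would carry those names through instead.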


4 Comments

Thanks @Ronak! It was my mistake to include quarter in the list out. The variable quarter belongs only to the data frame df; only after merging the list and the data frame does quarter appear in the list elements. Therefore merging by quarter is not possible in my original task.
So you can remove it from by in left_join.
For some reason this works in the example I designed but not with my real data: Error: Problem with `summarise()` input `new`. x contrasts can be applied only to factors with 2 or more levels ℹ Input `new` is `list(lm(TE_indiv ~ as.factor(size) - 1) %>% broom::tidy(.))`. ℹ The error occurred in group 5: quarter = NA. In my real case, size is a character variable. If I group by month the whole process runs correctly; if I group by quarter it fails with the error above. The character vector quarter doesn't contain any NAs. Any idea?
In that case you have to use tryCatch to catch those errors. See if my updated answer helps in your real data.
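
A note on the error quoted in the comments above: lm() raises "contrasts can be applied only to factors with 2 or more levels" when a factor in the formula has only one level in the data being fitted. A group labelled quarter = NA can also appear after left_join() when some rows of the left table have no match in the right table, even if the right table's quarter column has no missing values. The sketch below uses made-up data (left_tbl, lookup) to show both effects, plus one possible way to drop such groups up front instead of catching the error. Whether this is what happens in the real data is an assumption.

```r
library(dplyr)

# Hypothetical data: id 3 has no counterpart in `lookup`
left_tbl <- data.frame(id   = c(1, 1, 2, 2, 3),
                       size = c("a", "b", "a", "b", "a"),
                       y    = c(0.2, 0.4, 0.1, 0.5, 0.9))
lookup <- data.frame(id = c(1, 2), quarter = c("1", "1"))

joined <- left_join(left_tbl, lookup, by = "id")

# The unmatched row gets quarter = NA although `lookup` contains no NA
n_missing <- sum(is.na(joined$quarter))

# Fitting the NA group alone fails: `size` has a single level there
msg <- tryCatch(
  lm(y ~ as.factor(size) - 1, data = filter(joined, is.na(quarter))),
  error = function(e) conditionMessage(e)
)

# Keep only groups in which the factor has at least two levels
usable <- joined %>%
  group_by(quarter) %>%
  filter(n_distinct(size) >= 2) %>%
  ungroup()
```

Filtering first keeps the later summarise() step free of placeholder rows, while tryCatch() keeps every group in the output and marks the failed ones with NA; which is preferable depends on whether the dropped groups need to stay visible.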

As @RonakShah says, your code fails for an individual element of the list. It's not at all clear what you're trying to achieve, but

out %>% 
  bind_rows(.id="element") %>% 
  left_join(df, by=c("id", "period")) %>% 
  mutate(period=as.factor(period)) %>% 
  group_by(element) %>% 
  group_map(function(.x, .y) lm(pred_dif ~ period-1, data=.x))

at least runs without warning or error and gives possibly sensible output:

[[1]]

Call:
lm(formula = pred_dif ~ period - 1, data = .x)

Coefficients:
period01  period02  period08  period09  period11  period12  
   0.365     0.600     0.620     0.100     0.370     0.412  


[[2]]

Call:
lm(formula = pred_dif ~ period - 1, data = .x)

Coefficients:
period01  period02  period08  period09  period11  period12  
   0.540     0.630     0.270     0.180     0.170     0.232  


[[3]]

Call:
lm(formula = pred_dif ~ period - 1, data = .x)

Coefficients:
period01  period02  period08  period09  period11  period12  
   0.355     0.300     0.525     0.130     0.270     0.552  


[[4]]

Call:
lm(formula = pred_dif ~ period - 1, data = .x)

Coefficients:
period01  period02  period08  period09  period11  period12  
   0.295     0.760     0.705     0.610     0.480     0.618

3 Comments

Thank you @Limey. I think I did not manage to explain the point. The goal is to explain pred_dif using the month variable within each quarter.
I still have no idea what you're trying to achieve. I suggest you provide your expected output and define the process you wish to implement to get to the output, for a single element of the out list. That may give us a chance to implement it. (And, possibly, it may show you how to achieve your desired result yourself.)
I expect the same output as the following: lmout_lst <- map(out, ~ left_join(.x, df, by = c('id', 'month')) %>% #group_by(quarter) %>% summarise(new = list(lm(pred_dif ~ as.factor(month) - 1) %>% broom::tidy(.))) %>% unnest(new)) but "estimated" 4 times (once for each quarter) instead of once for each "element".
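
One possible reading of this request is: stack all elements of out, attach quarter from df, and fit one model per quarter on the pooled rows. The following is a minimal sketch under that assumption, not a confirmed answer to the question. To stay short it rebuilds the question's data with only the first two pred_dif vectors, and it swaps summarise(list(...)) for group_modify() so that the group's data can be passed to lm() explicitly.

```r
library(dplyr)
library(broom)

# Question's data, trimmed to two list elements for brevity
id      <- c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5)
quarter <- c("1", "2", "1", "1", "2", "3", "1", "1", "3", "3")
month   <- c(3, 4, 2, 1, 5, 7, 3, 1, 8, 9)
df <- data.frame(id, quarter, month)
out <- list(
  data.frame(id, month,
             pred_dif = c(0.5, 0.1, 0.15, 0.23, 0.75, 0.6, 0.49, 0.81, 0.37, 0.14)),
  data.frame(id, month,
             pred_dif = c(0.45, 0.18, 0.35, 0.63, 0.25, 0.63, 0.29, 0.11, 0.17, 0.24))
)

by_quarter <- out %>%
  bind_rows(.id = "element") %>%            # stack the list, remember the source
  left_join(df, by = c("id", "month")) %>%  # attach quarter
  group_by(quarter) %>%
  # one model per quarter, pooled over all elements
  group_modify(function(.x, .y) {
    tidy(lm(pred_dif ~ as.factor(month) - 1, data = .x))
  }) %>%
  ungroup()
```

With this example data there are three distinct quarters, so the result contains three fitted models (one coefficient per month observed in each quarter).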
