按年循环 lm

Loop lm by year

我在 R 中有一个数据框,如上所示

Id  ln_W  Year  Exp 
1   2.5   2010   15
1   2.3   2011   16
2   2.1   2010   20
3   2.5   2012   17
3   2.5   2013   18

我想在我的数据集中每年回归 ln_W~Exp 并以列表格式保存结果摘要。

有人知道怎么做吗?

base R中,我们通过Yearsplit,用lapply循环list,用lm创建模型并将输出存储为 list

out <- lapply(split(df1, df1$Year), function(x)
                   lm(ln_W ~ Exp, data = x))

注意:这不需要任何包


或者另一个选项是 lmList 来自 lme4

library(lme4)
lmList(ln_W  ~Exp | Year, data = df1)
#Call: lmList(formula = ln_W ~ Exp | Year, data = df1) 
#Coefficients:
#     (Intercept)   Exp
#2010         3.7 -0.08
#2011         2.3    NA
#2012         2.5    NA
#2013         2.5    NA

#Degrees of freedom: 5 total; -3 residual
#Residual standard error: 0

数据

df1 <- structure(list(Id = c(1L, 1L, 2L, 3L, 3L), ln_W = c(2.5, 2.3, 
2.1, 2.5, 2.5), Year = c(2010L, 2011L, 2010L, 2012L, 2013L), 
    Exp = c(15L, 16L, 20L, 17L, 18L)), class = "data.frame",
    row.names = c(NA, 
-5L))

您可以按 Year 分组,然后将 lm 摘要保存为列表列:

library(tidyverse)

df %>%
  group_by(Year) %>% 
  summarise(fit = list(lm(ln_W ~ Exp, data = cur_data()) %>% summary))

输出:

# A tibble: 4 x 2
   Year fit       
  <int> <list>    
1  2010 <smmry.lm>
2  2011 <smmry.lm>
3  2012 <smmry.lm>
4  2013 <smmry.lm>

通过将 %>% pull(fit) 添加到链中,仅获取摘要列表。
(请注意,对于提供的数据,这些摘要不会显示太多,只是截距,因为没有足够的观察数据来拟合。)

为什么不使用 by.

res <- lapply(by(d, d$Year, lm, formula=ln_W ~ Exp), summary)
res
# $`2010`
# 
# Call:
# FUN(formula = ..1, data = data[x, , drop = FALSE])
# 
# Residuals:
#   ALL 2 residuals are 0: no residual degrees of freedom!
#   
#   Coefficients:
#   Estimate Std. Error t value Pr(>|t|)
# (Intercept)     3.70         NA      NA       NA
# Exp            -0.08         NA      NA       NA
# 
# Residual standard error: NaN on 0 degrees of freedom
# Multiple R-squared:      1,   Adjusted R-squared:    NaN 
# F-statistic:   NaN on 1 and 0 DF,  p-value: NA
# 
# 
# $`2011`
# 
# Call:
# FUN(formula = ..1, data = data[x, , drop = FALSE])
# 
# Residuals:
#   ALL 1 residuals are 0: no residual degrees of freedom!
#   
#   Coefficients: (1 not defined because of singularities)
# Estimate Std. Error t value Pr(>|t|)
# (Intercept)      2.3         NA      NA       NA
# Exp               NA         NA      NA       NA
# 
# Residual standard error: NaN on 0 degrees of freedom
# [...]

数据:

d <- structure(list(Id = c(1L, 1L, 2L, 3L, 3L), ln_W = c(2.5, 2.3, 
2.1, 2.5, 2.5), Year = c(2010L, 2011L, 2010L, 2012L, 2013L), 
    Exp = c(15L, 16L, 20L, 17L, 18L)), class = "data.frame", row.names = c(NA, 
-5L))