在 R 中的一批计算中存储斜率、截距和 R2 以进行多元回归

Store slope, intercept, and R2 for multiple regressions in one batch calculation in R

我有一个包含一长串数据的数据框:

Well.ID | Year | Ave.GWE
1       | 2005 | 525
1       | 2006 | 524
1       | 2004 | 523
2       | 2005 | 552
2       | 2006 | 551
2       | 2007 | 550
.
.
.
10       | 2005 | 582
10       | 2006 | 581
10       | 2007 | 580

我已经能够使用 ggplot、facet_rep_wrap、geom_smooth 和 [=23= 为每个 Well.ID 绘制 Years 与 GWE 线性回归的批处理图].现在我想为每个回归创建一个包含以下内容的数据框:


Well.ID | m (slope) | b (intercept) | R2

有谁知道我是否可以通过 forloop 运行 lm 函数并自动存储所有这些信息?

谢谢

我们需要更多的数据来证明。以下内容应与您的结构大致匹配:

set.seed(1)

df <- data.frame(Well.ID = rep(1:5, each = 5),
                 Year = rep(2005:2009, 5),
                 Ave.GWE = round(runif(25, 400, 500)))
head(df)
#>   Well.ID Year Ave.GWE
#> 1       1 2005     427
#> 2       1 2006     437
#> 3       1 2007     457
#> 4       1 2008     491
#> 5       1 2009     420
#> 6       2 2005     490

我们可以通过

得到你的结果
do.call(rbind, lapply(unique(df$Well.ID), function(d) {
  model <- lm(Ave.GWE ~ Year, data = df[df$Well.ID == d,])
  data.frame(Well.ID = d, Intercept = coef(model)[1],
             Slope = coef(model)[2], r_squared = summary(model)$r.squared,
             row.names = NULL)
}))
#>   Well.ID Intercept Slope  r_squared
#> 1       1   -7581.6   4.0 0.04903163
#> 2       2   40403.1 -19.9 0.80086151
#> 3       3  -26047.8  13.2 0.59000406
#> 4       4   -3948.0   2.2 0.02105080
#> 5       5   28541.8 -14.0 0.42416898

reprex package (v2.0.1)

于 2022-02-09 创建

人类可读的解决方案;

modelfun <- function(x){
    model <- lm(Ave.GWE ~ Year,x)
    coefs <- coefficients(model)
    
    intercept <- coefs[1]
    slope <- coefs[2]
    rsq <- summary(model)$r.squared
    
    list(intercept = intercept,slope = slope,rsq = rsq)
}

newdf <- data.frame()

for(i in unique(df[['Well.ID']])){
    subset_df <- subset(df,Well.ID == i)
    
    modelstored <- modelfun(subset_df)
    
    newrow <- data.frame(Well.ID = i,
                         m = modelstored$slope,
                         b = modelstored$intercept,
                         R2 = modelstored$rsq)
    rownames(newrow) <- NULL
    newdf <- rbind(newdf,newrow)
}

newdf

输出;

  Well.ID      m     b    R2
    <dbl>  <dbl> <dbl> <dbl>
1       1  0.500 -479. 0.250
2       2 -1.00  2557. 1    
3      10 -1.00  2587. 1