Store slope, intercept, and R2 for multiple regressions in one batch calculation in R
I have a data frame containing a long list of data:
Well.ID | Year | Ave.GWE
1 | 2005 | 525
1 | 2006 | 524
1 | 2004 | 523
2 | 2005 | 552
2 | 2006 | 551
2 | 2007 | 550
.
.
.
10 | 2005 | 582
10 | 2006 | 581
10 | 2007 | 580
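I have been able to use ggplot, facet_rep_wrap, and geom_smooth (among other functions) to plot batch graphs of the linear regression of Year vs. GWE for each Well.ID. Now I would like to create a data frame containing the following for each regression: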
Well.ID | m (slope) | b (intercept) | R2
Does anyone know whether I can run the lm function in a for loop and store all of this information automatically?
Thanks
We need a bit more data to demonstrate. The following should roughly match your structure:
set.seed(1)
df <- data.frame(Well.ID = rep(1:5, each = 5),
                 Year = rep(2005:2009, 5),
                 Ave.GWE = round(runif(25, 400, 500)))
head(df)
#> Well.ID Year Ave.GWE
#> 1 1 2005 427
#> 2 1 2006 437
#> 3 1 2007 457
#> 4 1 2008 491
#> 5 1 2009 420
#> 6 2 2005 490
We can get your result with:
do.call(rbind, lapply(unique(df$Well.ID), function(d) {
  model <- lm(Ave.GWE ~ Year, data = df[df$Well.ID == d,])
  data.frame(Well.ID = d, Intercept = coef(model)[1],
             Slope = coef(model)[2], r_squared = summary(model)$r.squared,
             row.names = NULL)
}))
#> Well.ID Intercept Slope r_squared
#> 1 1 -7581.6 4.0 0.04903163
#> 2 2 40403.1 -19.9 0.80086151
#> 3 3 -26047.8 13.2 0.59000406
#> 4 4 -3948.0 2.2 0.02105080
#> 5 5 28541.8 -14.0 0.42416898
Created on 2022-02-09 by the reprex package (v2.0.1)
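The same per-well fit can also be written with split(), which hands each well's rows to lm() without manual subsetting. This is only a sketch under the same simulated data; the names `models`, `fit_stats`, and `result` are illustrative and not part of the original answer.

```r
# Same simulated data as in the answer above
set.seed(1)
df <- data.frame(Well.ID = rep(1:5, each = 5),
                 Year = rep(2005:2009, 5),
                 Ave.GWE = round(runif(25, 400, 500)))

# split() returns a named list with one sub-data-frame per Well.ID
models <- lapply(split(df, df$Well.ID), function(d) lm(Ave.GWE ~ Year, data = d))

# Extract intercept, slope and R-squared from each fitted model
fit_stats <- t(vapply(models, function(mod) {
  c(b = unname(coef(mod)[1]),
    m = unname(coef(mod)[2]),
    R2 = summary(mod)$r.squared)
}, c(b = 0, m = 0, R2 = 0)))

# One row per well; the list names carry the Well.ID values
result <- data.frame(Well.ID = as.integer(rownames(fit_stats)),
                     fit_stats, row.names = NULL)
result
```

Keeping the fitted models in a list can be handy if residuals or predictions are needed later, at the cost of holding every model in memory.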
A human-readable solution:
modelfun <- function(x){
  model <- lm(Ave.GWE ~ Year,x)
  coefs <- coefficients(model)
  intercept <- coefs[1]
  slope <- coefs[2]
  rsq <- summary(model)$r.squared
  list(intercept = intercept,slope = slope,rsq = rsq)
}
newdf <- data.frame()
for(i in unique(df[['Well.ID']])){
  subset_df <- subset(df,Well.ID == i)
  modelstored <- modelfun(subset_df)
  newrow <- data.frame(Well.ID = i,
                       m = modelstored$slope,
                       b = modelstored$intercept,
                       R2 = modelstored$rsq)
  rownames(newrow) <- NULL
  newdf <- rbind(newdf,newrow)
}
newdf
Output:
  Well.ID      m     b    R2
    <dbl>  <dbl> <dbl> <dbl>
1       1  0.500 -479. 0.250
2       2 -1.00  2557. 1
3      10 -1.00  2587. 1
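As a sanity check, these numbers can be reproduced without lm() from the closed-form formulas for simple linear regression: slope = cov(x, y) / var(x), intercept = mean(y) - slope * mean(x), and R2 = cor(x, y)^2. The sketch below assumes the output above was computed from the nine rows visible in the question (wells 1, 2 and 10); the names `wells` and `check` are illustrative.

```r
# The nine rows shown in the question (assumed to be the data behind the output)
wells <- data.frame(
  Well.ID = rep(c(1, 2, 10), each = 3),
  Year = c(2005, 2006, 2004, 2005, 2006, 2007, 2005, 2006, 2007),
  Ave.GWE = c(525, 524, 523, 552, 551, 550, 582, 581, 580)
)

# Closed-form simple regression for each well
check <- do.call(rbind, lapply(split(wells, wells$Well.ID), function(d) {
  m <- cov(d$Year, d$Ave.GWE) / var(d$Year)   # slope
  b <- mean(d$Ave.GWE) - m * mean(d$Year)     # intercept
  data.frame(Well.ID = d$Well.ID[1], m = m, b = b,
             R2 = cor(d$Year, d$Ave.GWE)^2)
}))
rownames(check) <- NULL
check
```

Wells 2 and 10 fall exactly on a line in this sample, which is why their R2 is 1; summary() on such a fit may warn about an essentially perfect fit.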