How to pass a formula as a parameter to lm in sapply?
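It seems that lm does not accept a formula passed as a parameter when it is called inside sapply.
Just lm
On its own, lm accepts the formula parameter FO: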
summary(lm(y ~ x, df1, df1[["z"]] == 1, df1[["w"]]))$coef[1, ]
summary(lm(FO, data, data[[st]] == st1, data[[ws]]))$coef[1, ]
lm in sapply
The same two calls inside sapply:
sapply(unique(df1$z), function(s)
summary(lm(y ~ x, df1, df1[["z"]] == s, df1[[ws]]))$coef[1, ])
sapply(unique(data[[st]]), function(s)
summary(lm(FO, data, data[[st]] == s, data[[ws]]))$coef[1, ])
The second call, which uses FO, yields an error:
Error in eval(substitute(subset), data, env) : object 's' not found
When everything except the formula FO is passed as a parameter, it still works:
sapply(unique(data[[st]]), function(s)
summary(lm(y ~ x, data, data[[st]] == s, data[[ws]]))$coef[1, ])
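One way to look at the difference between the two sapply calls is the environment each formula carries. The sketch below is my own reading, not something stated in the thread: a formula written inside the anonymous function gets that function's frame (where s lives) as its environment, while FO keeps the environment it was created in. The names f_inline and envs are made up for illustration.

```r
FO <- y ~ x            # created outside the function
environment(FO)        # <environment: R_GlobalEnv> when run at top level

envs <- sapply(1:2, function(s) {
  f_inline <- y ~ x    # created inside the anonymous function
  # TRUE if the formula's environment is this call's own frame,
  # i.e. the place where `s` is defined
  identical(environment(f_inline), environment())
})
envs
```

If that reading is right, lm looks up s via the formula's environment, which would explain why the inline y ~ x finds s and FO does not.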
lm in a for loop
In a for loop, however, all parameters work:
m <- matrix(NA, 4, length(unique(data[[st]])))
for (s in unique(data[[st]])) {
m[, s] <- summary(lm(FO, data, data[[st]] == s, data[[ws]]))$coef[1, ]
}
m
# [,1] [,2] [,3]
# [1,] 1.6269038 -0.1404174 -0.010338774
# [2,] 0.9042738 0.4577001 1.858138516
# [3,] 1.7991275 -0.3067890 -0.005564049
# [4,] 0.3229600 0.8104951 0.996457853
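A plausible reason the for loop behaves differently, assuming the loop runs at top level as in the question: for is not a function call, so the loop variable s ends up as an ordinary variable in the same environment FO was created in. A minimal check (s_visible is a made-up name):

```r
FO <- y ~ x                      # formula created where the loop will run

for (s in 1:3) {
  # nothing needed here; only the loop variable matters for this check
}

# `for` does not open a new scope, so `s` is a plain variable in the
# environment that ran the loop, the same one FO was created in.
s_visible <- exists("s", envir = environment(FO), inherits = FALSE)
s_visible
```

So anything that looks for s starting from environment(FO) can find it here, unlike in the sapply version where s exists only inside the anonymous function.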
Data:
df1 <- structure(list(x = c(1.37095844714667, -0.564698171396089, 0.363128411337339,
0.63286260496104, 0.404268323140999, -0.106124516091484, 1.51152199743894,
-0.0946590384130976, 2.01842371387704), y = c(1.30824434809425,
0.740171482827397, 2.64977380403845, -0.755998096151299, 0.125479556323628,
-0.239445852485142, 2.14747239550901, -0.37891195982917, -0.638031707027734
), z = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), w = c(0.7, 0.8,
1.2, 0.9, 1.3, 1.2, 0.8, 1, 1)), class = "data.frame", row.names = c(NA,
-9L))
FO <- y ~ x; data <- df1; st <- "z"; ws <- "w"; st1 <- 1
sessionInfo():
R version 3.6.0 (2019-04-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
Matrix products: default
locale:
[1] LC_COLLATE=German_Switzerland.1252 LC_CTYPE=German_Switzerland.1252
[3] LC_MONETARY=German_Switzerland.1252 LC_NUMERIC=C
[5] LC_TIME=German_Switzerland.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.6.0 tools_3.6.0 yaj_0.0.0.9044 packrat_0.5.0
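This worked when I tried it. It seems that your use of x in the formula interferes with how you want the function to behave. Renaming the function argument to num produces what sounds like the result you are looking for. This ensures that x in the formula refers to the dataset and not to the function argument.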
sapply(unique(dat$z), function(num) summary(lm(y ~ x, dat, z == num))$coef[1, ])
Thanks to a hint from @David on R-help to try do.call, I was able to figure it out. The solution is:
sapply(unique(data[[st]]), function(s)
summary(do.call("lm", list(FO, data, data[[st]] == s,
data[[ws]])))$coef[1, ])
# [,1] [,2] [,3]
# Estimate 1.6269038 -0.1404174 -0.010338774
# Std. Error 0.9042738 0.4577001 1.858138516
# t value 1.7991275 -0.3067890 -0.005564049
# Pr(>|t|) 0.3229600 0.8104951 0.996457853
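Explanation (from @Duncan on R-help): the formula keeps the environment in which it was created, and the call made from inside the sapply function probably does not see the function's own variables through it:
> environment(FO)
# <environment: R_GlobalEnv>
That is probably why it works with do.call and a list of arguments.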
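For comparison, here is a sketch of a second workaround next to the do.call one. Variant B is my own assumption, not something proposed in the thread: it copies FO inside the function and points the copy's environment at the function's frame, so that s can be found from the formula. The names res_docall, res_env and f are made up; the data is the df1 from the question.

```r
# Data from the question, rebuilt so the block runs on its own
df1 <- data.frame(
  x = c(1.37095844714667, -0.564698171396089, 0.363128411337339,
        0.63286260496104, 0.404268323140999, -0.106124516091484,
        1.51152199743894, -0.0946590384130976, 2.01842371387704),
  y = c(1.30824434809425, 0.740171482827397, 2.64977380403845,
        -0.755998096151299, 0.125479556323628, -0.239445852485142,
        2.14747239550901, -0.37891195982917, -0.638031707027734),
  z = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L),
  w = c(0.7, 0.8, 1.2, 0.9, 1.3, 1.2, 0.8, 1, 1)
)
FO <- y ~ x; data <- df1; st <- "z"; ws <- "w"

# Variant A (from the answer above): do.call evaluates the argument list
# first, so lm receives a ready-made logical vector instead of an
# expression that mentions `s`.
res_docall <- sapply(unique(data[[st]]), function(s)
  summary(do.call("lm", list(FO, data, data[[st]] == s,
                             data[[ws]])))$coef[1, ])

# Variant B (assumption, not from the thread): give a local copy of the
# formula the function's own environment, where `s` is defined.
res_env <- sapply(unique(data[[st]]), function(s) {
  f <- FO
  environment(f) <- environment()
  summary(lm(f, data, data[[st]] == s, data[[ws]]))$coef[1, ]
})

# Both variants should agree on the intercept rows
all.equal(res_docall, res_env)
```

Variant B leaves the global FO untouched because only the local copy f is modified. Whether it is preferable to do.call is a matter of taste; do.call has the side effect of embedding the evaluated arguments in the stored call of the fitted model.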