以编程方式通过 glm() 将 family= 传递给 step()
pass family= to step() via glm() programmatically
我试图通过模拟来演示不同模型的性能和特征选择技术,因此我希望以编程方式将各种参数传递给 glm()
。
在?glm
下我们读到(斜体是我的):
family: a description of the error distribution and link function to be used in the model. For glm this can be a character string naming a
family function, a family function or the result of a call to a family function. For glm.fit only the third option is supported. (See family for details of family functions.)
问题是当我随后在生成的模型上调用 step()
时,似乎存在范围界定问题并且不再识别 family=
参数。
这是一个最小的例子:
getCoef <- function(formula,
family = c("gaussian", "binomial"),
data){
model_fam <- match.arg(family, c("gaussian", "binomial"))
fit_null <- glm(update(formula,".~1"),
family = model_fam,
data = data)
message("So far so good")
fit_stepBIC <- step(fit_null,
formula,
direction="forward",
k = log(nrow(data)),
trace=0)
message("Doesn't make it this far")
fit_stepBIC$coefficients
}
# returns error 'model_fam' not found
getCoef(Petal.Length ~ Petal.Width + Species, family = "gaussian", data = iris)
带回溯的错误消息:
> getCoef(Petal.Length ~ Petal.Width + Species, family = "gaussian", data = iris)
So far so good
Error in stats::glm(formula = Petal.Length ~ Petal.Width + Species, family = model_fam, :
object 'model_fam' not found
9 stats::glm(formula = Petal.Length ~ Petal.Width + Species, family = model_fam,
data = data, method = "model.frame")
8 eval(expr, envir, enclos)
7 eval(fcall, env)
6 model.frame.glm(fob, xlev = object$xlevels)
5 model.frame(fob, xlev = object$xlevels)
4 add1.glm(fit, scope$add, scale = scale, trace = trace, k = k,
...)
3 add1(fit, scope$add, scale = scale, trace = trace, k = k, ...)
2 step(fit_null, formula, direction = "forward", k = log(nrow(data)),
trace = 0)
1 getCoef(Petal.Length ~ Petal.Width + Species, family = "gaussian",
data = iris)
> sessionInfo()
R version 3.2.4 Revised (2016-03-16 r70336)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] rsconnect_0.4.1.11 tools_3.2.4
传递此参数以便被步骤识别的最自然方式是什么?我知道的一种可能的解决方法是通过以 model_fam
.
为条件的 if-then-else 使用显式姓氏调用 glm()
我认为以下基于 eval
、bquote
和 .()
的解决方案可能会解决您的问题。
我还安装了 R-version 3.2.4,我得到了与您从代码中得到的完全相同的错误。下面的解决方案使它在我的电脑上工作。
getCoef <- function(formula,
family = c("gaussian", "binomial"),
data){
model_fam <- match.arg(family, c("gaussian", "binomial"))
fit_null <- eval(bquote(
glm(update(.(formula),".~1"),
family = .(model_fam),
data = .(data))))
message("So far so good")
fit_stepBIC <- step(fit_null,
formula,
direction="forward",
k = log(nrow(data)),
trace=0)
message("Doesn't make it this far")
fit_stepBIC$coefficients
}
# returns error 'model_fam' not found
getCoef(formula = Petal.Length ~ Petal.Width + Species,
family = "gaussian",
data = iris)
So far so good
Doesn't make it this far
(Intercept) Speciesversicolor Speciesvirginica Petal.Width
1.211397 1.697791 2.276693 1.018712
问题是step
最终调用了model.frame
并且model.frame
在一个特殊的环境,即定义公式的环境中计算terms对象。这通常是调用 getCoef
的环境。但是在这个环境中 model_fam
不存在,因为它是在 getCoef
中定义的。修复它的一种方法是添加
environment(formula) <- environment()
在
之后
model_fam <- match.arg(family, c("gaussian", "binomial"))
或类似的东西。
我试图通过模拟来演示不同模型的性能和特征选择技术,因此我希望以编程方式将各种参数传递给 glm()
。
在?glm
下我们读到(斜体是我的):
family: a description of the error distribution and link function to be used in the model. For glm this can be a character string naming a family function, a family function or the result of a call to a family function. For glm.fit only the third option is supported. (See family for details of family functions.)
问题是当我随后在生成的模型上调用 step()
时,似乎存在范围界定问题并且不再识别 family=
参数。
这是一个最小的例子:
getCoef <- function(formula,
family = c("gaussian", "binomial"),
data){
model_fam <- match.arg(family, c("gaussian", "binomial"))
fit_null <- glm(update(formula,".~1"),
family = model_fam,
data = data)
message("So far so good")
fit_stepBIC <- step(fit_null,
formula,
direction="forward",
k = log(nrow(data)),
trace=0)
message("Doesn't make it this far")
fit_stepBIC$coefficients
}
# returns error 'model_fam' not found
getCoef(Petal.Length ~ Petal.Width + Species, family = "gaussian", data = iris)
带回溯的错误消息:
> getCoef(Petal.Length ~ Petal.Width + Species, family = "gaussian", data = iris)
So far so good
Error in stats::glm(formula = Petal.Length ~ Petal.Width + Species, family = model_fam, :
object 'model_fam' not found
9 stats::glm(formula = Petal.Length ~ Petal.Width + Species, family = model_fam,
data = data, method = "model.frame")
8 eval(expr, envir, enclos)
7 eval(fcall, env)
6 model.frame.glm(fob, xlev = object$xlevels)
5 model.frame(fob, xlev = object$xlevels)
4 add1.glm(fit, scope$add, scale = scale, trace = trace, k = k,
...)
3 add1(fit, scope$add, scale = scale, trace = trace, k = k, ...)
2 step(fit_null, formula, direction = "forward", k = log(nrow(data)),
trace = 0)
1 getCoef(Petal.Length ~ Petal.Width + Species, family = "gaussian",
data = iris)
> sessionInfo()
R version 3.2.4 Revised (2016-03-16 r70336)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] rsconnect_0.4.1.11 tools_3.2.4
传递此参数以便被步骤识别的最自然方式是什么?我知道的一种可能的解决方法是通过以 model_fam
.
glm()
我认为以下基于 eval
、bquote
和 .()
的解决方案可能会解决您的问题。
我还安装了 R-version 3.2.4,我得到了与您从代码中得到的完全相同的错误。下面的解决方案使它在我的电脑上工作。
getCoef <- function(formula,
family = c("gaussian", "binomial"),
data){
model_fam <- match.arg(family, c("gaussian", "binomial"))
fit_null <- eval(bquote(
glm(update(.(formula),".~1"),
family = .(model_fam),
data = .(data))))
message("So far so good")
fit_stepBIC <- step(fit_null,
formula,
direction="forward",
k = log(nrow(data)),
trace=0)
message("Doesn't make it this far")
fit_stepBIC$coefficients
}
# returns error 'model_fam' not found
getCoef(formula = Petal.Length ~ Petal.Width + Species,
family = "gaussian",
data = iris)
So far so good
Doesn't make it this far
(Intercept) Speciesversicolor Speciesvirginica Petal.Width
1.211397 1.697791 2.276693 1.018712
问题是step
最终调用了model.frame
并且model.frame
在一个特殊的环境,即定义公式的环境中计算terms对象。这通常是调用 getCoef
的环境。但是在这个环境中 model_fam
不存在,因为它是在 getCoef
中定义的。修复它的一种方法是添加
environment(formula) <- environment()
在
之后model_fam <- match.arg(family, c("gaussian", "binomial"))
或类似的东西。