Non-standard evaluation with mgcv::gam

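I am writing a function that takes an unevaluated call to a regression function as input, creates some data, and then evaluates the call. Here is an example: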

library(lme4)
compute_fit <- function(m){
  # Generate some data
  df <- data.frame(x = rnorm(100), ID = sample(4, 100, replace = TRUE))
  # y is added in a second step because x is not visible inside data.frame()
  df$y <- df$x + rnorm(100)
  # Evaluate the call
  eval(m, envir = df)
}

# Create a list of models
models <- list(
  lm = call("lm", quote(list(formula = y ~ x))),
  glm = call("glm", quote(list(formula = y ~ x))),
  lmer = call("lmer", quote(list(formula = y ~ x + (1 | ID))))
)

# Evaluate the call (this works fine)
model_fits <- lapply(models, compute_fit)

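The reason I am doing this is that I am running a simulation study in which I fit many different models to many Monte Carlo samples. The function is part of an internal package; I want to supply a list of models and have them evaluated inside the package.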

I would also like to use the gam function from mgcv. The documentation for gam says the following about its data argument, which is essentially the same as for lm:

A data frame or list containing the model response variable and covariates required by the formula. By default the variables are taken from environment(formula): typically the environment from which gam is called.

I therefore tried the same logic with gam, expecting that eval(m, envir = df) in the compute_fit function defined above would evaluate the formula in the environment of df:

# Try with gam
library(mgcv)
gamcall = call("gam", quote(list(formula = y ~ x)))    
compute_fit(gamcall)  

However, this fails with the error message:

Error in eval(predvars, data, env): object 'y' not found
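
For reference, here is a minimal base-R sketch of the mechanism the reasoning above relies on: when eval is given a data frame as envir, symbols resolve to its columns, and a formula created during that evaluation keeps the temporary environment built from the data frame. The toy data frame and the variable names are illustrative assumptions, not part of the original code, and the sketch does not attempt to show what gam does differently.

```r
# Toy data frame; column names and values are illustrative only.
df <- data.frame(x = c(1, 2, 3), y = c(2, 4, 6))

# With a data frame as `envir`, symbols in the expression resolve to columns.
col_mean <- eval(quote(mean(x)), envir = df)

# A formula created during such an evaluation records the temporary
# environment built from the data frame, so the columns are reachable from it.
f <- eval(quote(y ~ x), envir = df)
has_cols <- all(c("x", "y") %in% ls(environment(f)))

# lm() falls back to environment(formula) when no `data` argument is given,
# which is why the lm call can be evaluated this way.
fit <- eval(quote(lm(y ~ x)), envir = df)
print(coef(fit))
```

This is the same mechanism that with(df, lm(y ~ x)) uses.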

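I realize this error may be related to a previously asked question, but my question is whether anyone can think of a workaround that lets me use gam in the same way as the other models. As far as I can tell, the linked question does not provide a solution to this problem.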

Here is a complete reprex:

set.seed(1)
library(lme4)
#> Loading required package: Matrix
compute_fit <- function(m){
  # Generate some data
  df <- data.frame(x = rnorm(100), ID = rep(1:50, 2))
  df$y <- df$x + rnorm(100, sd = .1)
  # Evaluate the call
  eval(m, envir = df)
}

# Create a list of models
models <- list(
  lm = call("lm", quote(list(formula = y ~ x))),
  glm = call("glm", quote(list(formula = y ~ x))),
  lmer = call("lmer", quote(list(formula = y ~ x + (1 | ID))))
)

# Evaluate the call (this works fine)
model_fits <- lapply(models, compute_fit)

# Try with gam
library(mgcv)
#> Loading required package: nlme
#> 
#> Attaching package: 'nlme'
#> The following object is masked from 'package:lme4':
#> 
#>     lmList
#> This is mgcv 1.8-26. For overview type 'help("mgcv-package")'.
gamcall = call("gam", quote(list(formula = y ~ x)))    
compute_fit(gamcall)    
#> Error in eval(predvars, data, env): object 'y' not found

I would add df to the call as its data argument instead of evaluating the call within df:

compute_fit <- function(m){
  # Generate some data
  set.seed(1)
  # Create x first so that the data frame column is actually named "x"
  x <- rnorm(100)
  df <- data.frame(x = x, y = rnorm(100) + x^3, ID = sample(4, 100, replace = TRUE))
  # Add the data argument to the call
  m[["data"]] <- quote(df)
  # Evaluate the call
  eval(m)
}

# Create a list of models
models <- list(
  lm = quote(lm(formula = y ~ x)),
  glm = quote(glm(formula = y ~ x)),
  lmer = quote(lmer(formula = y ~ x + (1 | ID))),
  gam = quote(gam(formula = y ~ s(x)))
)

model_fits <- lapply(models, compute_fit)
# Works, but lmer reports a singular fit
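
A reduced sketch of the same approach, limited to gam so that only mgcv is needed. The helper name fit_with_data is hypothetical; it also returns the modified call so that the injected data argument can be inspected.

```r
library(mgcv)

# Hypothetical helper: same idea as compute_fit above, but it also returns
# the modified call alongside the fitted model.
fit_with_data <- function(m) {
  set.seed(1)
  x <- rnorm(100)
  df <- data.frame(x = x, y = x^3 + rnorm(100))
  # Insert the symbol `df` as the data argument; it is looked up only
  # when the call is evaluated.
  m[["data"]] <- quote(df)
  # eval() defaults to the calling frame, i.e. this function's frame,
  # where `df` is defined.
  fit <- eval(m)
  list(call = m, fit = fit)
}

res <- fit_with_data(quote(gam(formula = y ~ s(x))))
print(res$call)
print(class(res$fit))
```

Because eval(m) runs inside the function, the symbol df resolves to the local data frame, so no global df is required.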