在 R 中对 reformulated() 项使用 anova()

Using anova() on reformulated() terms in R

背景:

我编写了一个包装函数,允许用户指定数据集中变量的名称,以用作 NLME lme() 函数中的预测变量,以拟合分层线性模型。相关部分如下所示:

predictors <- str_c(c(w,x), collapse = " + ")
mod = lme(fixed = reformulate(termlabels = predictors, response = y),
          random = reformulate(termlabels = paste0("1|", cluster_label)),
          data = dat)

其中 w 是 Level-2(集群级别)预测变量,x 是 Level-1(单元级别)预测变量,cluster_label 是变量名称定义独特的集群。例如,如果 w = c("w1","w2)x = "x1" 和我的集群由 school 定义,那么(就拟合模型而言)这相当于调用:

mod = lme(fixed = y ~ w1 + w2 + x1, random = ~1|school, data = tmp)

问题:

函数运行正常。我的问题是我希望能够使用 anova() 来比较两个这样的模型对象,但是当我尝试这样做时,我收到以下错误,由于我使用 reformulate:

Error in reformulate(termlabels = predictors, response = y) : 
  object 'predictors' not found

而且,事实上,当我调用 summary(mod) 时,我在其他返回信息中得到以下行:

Fixed: reformulate(termlabels = predictors, response = y) 

我注意到 anova() 允许通过 ... 传递额外的参数,那么有没有办法使用这个参数来使函数按预期运行?或者,是否有另一种方法可以对我返回的两个模型对象执行似然比检验而不会出现此错误?如果我需要提供更多详细信息,请告诉我。

可重现示例:

这些是 HSB12 数据,可以很好地重现示例。这需要 nlmestringr 包。

fitVAM <- function(dat,y,w,x,cluster_label) {
  library(nlme)
  library(stringr)
  predictors <- str_c(c(w,x),collapse=" + ")
  mod = lme(fixed = reformulate(termlabels = predictors, response = y),
            random = reformulate(termlabels = paste0("1|", cluster_label)),
            data = dat)
  return(mod)
}
dat <- read.csv("http://www.ats.ucla.edu/stat/paperexamples/singer/hsb12.csv")
mod1 <- fitVAM(dat = dat, y = "mathach", w = "meanses", x = "female", cluster_label = "school")
mod2 <- fitVAM(dat = dat, y = "mathach", w = "meanses", x = c("female", "ses"), cluster_label = "school")
anova(mod1,mod2)

可能有更直接的方法,但您可以使用 do.call:

fitVAM <- function(dat,y,w,x,cluster_label) {
  library(nlme)
  library(stringr)
  predictors <- str_c(c(w,x),collapse=" + ")
  params <- list(fixed = reformulate(termlabels = predictors, response = y),
                 random = reformulate(termlabels = paste0("1|", cluster_label)),
                 data = dat)
  mod = do.call(lme, params)
  mod$call[[3]] <- substitute(dat) #otherwise dat is shown expanded in summary and print output
  return(mod)
}
#dat <- read.csv("http://www.ats.ucla.edu/stat/paperexamples/singer/hsb12.csv")
mod1 <- fitVAM(dat = dat, y = "mathach", w = "meanses", x = "female", cluster_label = "school")
mod2 <- fitVAM(dat = dat, y = "mathach", w = "meanses", x = c("female", "ses"), cluster_label = "school")
anova(mod1,mod2)

#     Model df      AIC      BIC    logLik   Test  L.Ratio p-value
#mod1     1  5 46908.59 46942.99 -23449.29                        
#mod2     2  6 46530.02 46571.29 -23259.01 1 vs 2 380.5734  <.0001
#Warning message:
#In anova.lme(mod1, mod2) :
#  fitted objects with different fixed effects. REML comparisons are not meaningful.