Getting standardized coefficients for a glmer model?

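I've been asked to provide standardized coefficients for a glmer model, but I'm not sure how to obtain them. Unfortunately, the beta function does not work on glmer models: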

Error in UseMethod("beta") : 
  no applicable method for 'beta' applied to an object of class "c('glmerMod', 'merMod')"

Is there another function I could use, or do I have to write one myself?

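Another problem is that the model contains several continuous predictors (which operate on similar scales) and 2 categorical predictors (one with 4 levels, one with 6 levels). The purpose of using the standardized coefficients would be to compare the impact of the categorical predictors to that of the continuous ones, and I'm not sure that standardized coefficients are the appropriate way to do so. Are standardized coefficients an acceptable approach?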

The model is as follows:

model <- glmer(cbind(nr_corr, maximum - nr_corr) ~ (condition | SUBJECT) +
                 categorical_1 + categorical_2 +
                 continuous_1 + continuous_2 + continuous_3 + continuous_4 +
                 categorical_1:categorical_2 + categorical_1:continuous_3,
               data,
               control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 100000)),
               family = binomial)

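reghelper::beta simply standardizes the numeric variables in the dataset. So, assuming your categorical variables are factors rather than numeric dummy variables or some other contrast coding, we can standardize the numeric variables in the dataset fairly simply: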

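# Names (not positions) of the continuous predictors in the model formula
vars <- grep('^continuous', all.vars(formula(model)), value = TRUE)
# Standardize one column; as.numeric() drops the matrix that scale() returns
f <- function(var, data)
  as.numeric(scale(data[[var]]))
data[vars] <- lapply(vars, f, data = data)
# Refit the model on the standardized data
model_std <- update(model, data = data)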
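As a sanity check on what such a refit does, here is a minimal base-R sketch. It uses glm on simulated data rather than glmer, and the names dat, x and y are made up for illustration, so treat it as an analogy rather than a reproduction of the model in the question. It shows that standardizing a predictor does not change the fit; it only rescales the slope by the predictor's standard deviation.

```r
# Simulated data; the names dat, x and y are hypothetical, not from the question
set.seed(1)
n <- 200
dat <- data.frame(x = rnorm(n, mean = 10, sd = 3))
dat$y <- rbinom(n, size = 1, prob = plogis(-2 + 0.2 * dat$x))

# Fit on the raw scale
fit_raw <- glm(y ~ x, family = binomial, data = dat)

# Standardize x and refit the same model
dat_std <- dat
dat_std$x <- as.numeric(scale(dat$x))
fit_std <- update(fit_raw, data = dat_std)

# The standardized slope should equal the raw slope times sd(x),
# and both fits should have the same deviance
slope_raw <- unname(coef(fit_raw)["x"])
slope_std <- unname(coef(fit_std)["x"])
same_slope <- isTRUE(all.equal(slope_std, slope_raw * sd(dat$x), tolerance = 1e-4))
same_fit <- isTRUE(all.equal(deviance(fit_raw), deviance(fit_std), tolerance = 1e-6))
c(same_slope = same_slope, same_fit = same_fit)
```

The same rescaling applies to the fixed effects of a mixed model, which is why refitting on scaled data is enough to obtain standardized coefficients.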

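Now for the more general case, we can more or less easily create our own beta.merMod function. However, we need to consider whether standardizing y makes sense. For example, if we have a poisson model, only non-negative integer values make sense. Another question is whether to scale the random slope effects, and whether it even makes sense to ask that in the first place. In the function I assume that categorical variables are encoded as character or factor rather than numeric or integer.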
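The concern about standardizing y can be made concrete with a short base-R sketch. The counts are toy values made up for illustration, and plain glm stands in for glmer. Once a count response is scaled it is no longer a valid Poisson outcome:

```r
# Toy data, made up for illustration
counts <- c(0, 1, 3, 2, 5, 4, 7, 6)
x <- seq_along(counts)

# After scaling, the response is centered, so some values are negative
y_std <- as.numeric(scale(counts))
has_negative <- any(y_std < 0)

# The Poisson family rejects negative responses, so the refit stops with an error;
# tryCatch() captures the error message instead of aborting
res <- tryCatch(
  glm(y_std ~ x, family = poisson),
  error = function(e) conditionMessage(e)
)
res
```

This is why it is safer to leave the response untouched for binomial and poisson models and to standardize only the predictors.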

beta.merMod <- function(model, 
                        x = TRUE, 
                        y = !family(model)$family %in% c('binomial', 'poisson'), 
                        ran_eff = FALSE, 
                        skip = NULL, 
                        ...){
  # Extract all variable names from the model formula
  vars <- all.vars(form <- formula(model))
  lhs <- all.vars(form[[2]])
  # Names of the grouping factors of the random effects
  groups <- names(ranef(model))
  # Remove the grouping factors and the response from vars
  rhs <- vars[!vars %in% c(lhs, groups)]
  # Extract the data used to fit the model
  env <- environment(form)
  data <- eval(getCall(model)$data, envir = env)
  # Collect the variables to standardize
  vars <- character()
  if(isTRUE(x))
    vars <- c(vars, rhs)
  if(isTRUE(y))
    vars <- c(vars, lhs)
  if(isTRUE(ran_eff))
    vars <- c(vars, groups)
  # Leave out anything listed in skip
  vars <- setdiff(vars, skip)
  # Standardize the numeric columns; as.numeric() drops the matrix that scale() returns
  data[vars] <- lapply(vars, function(var){
    if(is.numeric(data[[var]]))
      as.numeric(scale(data[[var]]))
    else
      data[[var]]
  })
  # Refit on the standardized data. The call is evaluated in a child of the
  # formula environment so that its data argument resolves to the standardized
  # copy, even if an object with the original name exists in that environment.
  call <- getCall(model)
  call$data <- quote(std_data)
  eval(call, list2env(list(std_data = data), parent = env))
}

The function works for linear and generalized linear mixed-effects models (it has not been tested on nonlinear models) and is used just like the other beta functions from reghelper:

library(reghelper)
library(lme4)
# Linear mixed effect model
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
# Standardize the predictors only; the default would also scale a Gaussian response
fm2 <- beta(fm1, y = FALSE)
fixef(fm1) - fixef(fm2)
#> (Intercept)        Days 
#>   -47.10279   -19.68157 

# Generalized mixed effect model
data(cbpp)
# Create a numeric variable correlated with period
# (no seed is set, so the values below depend on the random draw)
cbpp$nv <- 
  rnorm(nrow(cbpp), mean = as.numeric(levels(cbpp$period))[as.numeric(cbpp$period)])
gm1 <- glmer(cbind(incidence, size - incidence) ~ nv + (1 | herd),
              family = binomial, data = cbpp)
gm2 <- beta(gm1)
fixef(gm1) - fixef(gm2)
#> (Intercept)          nv 
#>   0.5946322   0.1401114
Note, however, that unlike beta, this function returns the updated model, not a summary of the model.

Another problem is that the model contains several continuous predictors (which operate on similar scales) and 2 categorical predictors (one with 4 levels, one with six levels). The purpose of using the standardized coefficients would be to compare the impact of the categorical predictors to those of the continuous ones, and I'm not sure that standardized coefficients are the appropriate way to do so. Are standardized coefficients an acceptable approach?

That is a good question, one better suited for stats.stackexchange, but I am not sure of the answer.

Thank you again, Oliver! For anyone interested in the answer to the last part of my question,

Another problem is that the model contains several continuous predictors (which operate on similar scales) and 2 categorical predictors (one with 4 levels, one with six levels). The purpose of using the standardized coefficients would be to compare the impact of the categorical predictors to those of the continuous ones, and I'm not sure that standardized coefficients are the appropriate way to do so. Are standardized coefficients an acceptable approach?

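you can find the answer here. tl;dr: using standardized regression coefficients is not the best approach for mixed models anyway, let alone for a model like mine...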