调整使用 lmerTest::lmer() 获得的 p 值以进行多重比较

Question

我想使用 lmerTest::lmer() 拟合线性混合模型并逐渐添加随机和固定效应（见下面的代码）。稍后我的目标是编译一个回归 table，包括所有具有 jtools::export_summs() 或 huxtable::huxreg() 的模型。在此步骤之前，我想使用 Bonferroni-Holm (BH) 方法调整回归中获得的 p 值以进行多重比较。

我将每个调整后的模型存储在一个列表中，并编写了一个函数来将 BH 应用于我的模型，如下所示：

summary(glht(model), test = adjusted('holm'))

但是，当我通过 huxreg(list_lm_models_adj) 或 export_summs(list_lm_models_adj) 编译带有调整模型的列表的回归 table 时，我收到以下错误消息：

"Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column"

检查调整后和未调整模型的摘要表明，在应用 summary(glht(model), test = adjusted('holm')) 时结构似乎发生了变化。比较 summary(model_lm2) 和 summary(model2_adjusted) 的输出，随机效应似乎在转换中丢失了。

# Define models
# ------------------------------------------------------------------
# base model: fixed effect: cat1
model_lm0  <- lm(likertscore ~ cat1, data = df_long)  

# + random effect: subject => (1 | subject)
model_lm  <- lmer(likertscore ~ cat1         + (1 | subject),                      data = df_long)  

# + fixed effect: index   => + index
model_lm1 <- lmer(likertscore ~ cat1 + index + (1 | subject),                      data = df_long)  

# full model
# + random effect: group   => (1 | group)
model_lm2 <- lmer(likertscore ~ cat1 + index + (1 | subject) + (1 | group),        data = df_long)  

# 1) unadjusted models => regression table
# ------------------------------------------------------------------
# Store models in list and output regression table
list_lm_models <- list()
list_lm_models[["model_lm"]] <- model_lm
list_lm_models[["model_lm1"]] <- model_lm1
list_lm_models[["model_lm2"]] <- model_lm2
huxreg(list_lm_models)
# ==> provides regression table with unadjusted p-values


# 2) adjusted models => regression table
# ------------------------------------------------------------------
# Function to adjust p-values
adjMC <- function( model_name ) {
    model_glht <- glht(model_name)
    model_MCadj <- summary(model_glht, test = adjusted('holm')) # Bonferroni-Holm 
  return(model_MCadj)
}

# Apply function to models
model_lm_adj <- adjMC( model_name = model_lm )  
model_lm1_adj <- adjMC( model_name = model_lm1 )  
model_lm2_adj <- adjMC( model_name = model_lm2 )  

# Store adjusted models in list and output regression table
list_lm_models_adj <- list()
list_lm_models_adj[["model_lm"]] <- model_adjusted
list_lm_models_adj[["model_lm1"]] <- model_lm1_adj
list_lm_models_adj[["model_lm2"]] <- model_lm2_adj
huxreg(list_lm_models_adj)

非常感谢任何帮助！

AddOn1:

分别调用huxreg(list_lm_models_adj)或export_summs(list_lm_models_adj)时出现错误

df_long 看起来如下：

'data.frame':   1715 obs. of  5 variables:
 $ subject     : Factor w/ 245 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ task        : Factor w/ 7 levels "Q1_Level1",..: 1 1 1 1 1 1 1 1 1 1 ...
  ..- attr(*, "contrasts")= num [1:7, 1:6] 0.25 0.25 0.25 0.25 -0.333 ...
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : chr  "Q1_Level1" "Q1_Level2" "Q1_Level3" "Q1_Level4" ...
  .. .. ..$ : chr  "c1_CogMem_vs_MechFun" "c2_Cog_vs_Mem" "c3_CogOnly_Math_vs_Words" "c4_MemOnly_Codes_vs_Encrypt" ...
 $ likertscore : int  4 3 4 7 4 7 4 7 7 2 ...
 $ index       : int  7 7 7 7 7 7 7 7 7 7 ...
 $ session_code: Factor w/ 24 levels "1t75nw8b","2wwkn7pm",..: 15 15 15 15 15 15 15 15 15 15 ...


 > headTail(df_long,8,8)
      subject              task likertscore index session_code
 1          1 Q1_Level1           4     7     lo0h31ts
 2          2 Q1_Level1           3     7     lo0h31ts
 3          3 Q1_Level1           4     7     lo0h31ts
 4          4 Q1_Level1           7     7     lo0h31ts
 5          5 Q1_Level1           4     7     lo0h31ts
 6          6 Q1_Level1           7     7     lo0h31ts
 7          7 Q1_Level1           4     7     lo0h31ts
 8          8 Q1_Level1           7     7     lo0h31ts
 ...     <NA>              <NA>         ...   ...         <NA>
 1708     238 Q1_Level7           1     2     5tc0tw92
 1709     239 Q1_Level7           3     2     5tc0tw92
 1710     240 Q1_Level7           3     5     v9z7sllr
 1711     241 Q1_Level7           4     5     v9z7sllr
 1712     242 Q1_Level7           2     5     v9z7sllr
 1713     243 Q1_Level7           1     5     v9z7sllr
 1714     244 Q1_Level7           4     5     v9z7sllr
 1715     245 Q1_Level7           3     5     v9z7sllr

AddOn2：最小工作示例

# MWE
# ------------------------------------------------------------------
library("tidyverse")
library("lmerTest")
library("multcomp")
library("huxtable") # or alternatively
# library("jtools")

states <- as.data.frame(state.x77)
df_wide <- states[, c("Frost", "Area")]
colnames(df_wide) <- c("cat1_level1", "cat1_level2")
# add column with "SubjectIDs":
df_wide$subject <- c(paste0("S", 1:(nrow(df_wide))))
df_long <- df_wide %>% 
    gather(cat1, likertscore, -subject)

# Define models
# ------------------------------------------------------------------
# base model: fixed effect: cat1
model_lm0   <- lm(likertscore ~ cat1, data = df_long)  
# + random effect: subject => (1 | subject)
model_lm    <- lmer(likertscore ~ cat1 + (1 | subject), data = df_long)  

# 1) unadjusted models => regression table
# ------------------------------------------------------------------
# Store models in list and output regression table
list_lm_models <- list()
list_lm_models[["model_lm0"]] <- model_lm0
list_lm_models[["model_lm"]] <- model_lm
huxreg(list_lm_models)
# ==> provides regression table with unadjusted p-values


# 2) adjusted models => regression table
# ------------------------------------------------------------------
# Function to adjust p-values
adjMC <- function( model_name ) {
    model_glht <- glht(model_name)
    model_MCadj <- summary(model_glht, test = adjusted('holm')) # Bonferroni-Holm 
  return(model_MCadj)
}

# Apply function to models
model_lm0_adj <- adjMC( model_name = model_lm0 )  
model_lm_adj <- adjMC( model_name = model_lm )  

# Store adjusted models in list and output regression table
list_lm_models_adj <- list()
list_lm_models_adj[["model_lm0"]] <- model_lm0_adj
list_lm_models_adj[["model_lm"]] <- model_lm_adj
huxreg(list_lm_models_adj)        # huxtable
# export_summs(list_lm_models_adj)  # jtools wrapper for huxtable::huxreg
# ==>  Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column

AddOn3: REPREX

library("tidyverse")
library("lmerTest")
#> Loading required package: lme4
#> Loading required package: Matrix
#> 
#> Attaching package: 'Matrix'
#> The following object is masked from 'package:tidyr':
#> 
#>     expand
#> 
#> Attaching package: 'lmerTest'
#> The following object is masked from 'package:lme4':
#> 
#>     lmer
#> The following object is masked from 'package:stats':
#> 
#>     step
library("multcomp")
#> Loading required package: mvtnorm
#> Loading required package: survival
#> Loading required package: TH.data
#> Loading required package: MASS
#> 
#> Attaching package: 'MASS'
#> The following object is masked from 'package:dplyr':
#> 
#>     select
#> 
#> Attaching package: 'TH.data'
#> The following object is masked from 'package:MASS':
#> 
#>     geyser
library("huxtable") # or alternatively
#> 
#> Attaching package: 'huxtable'
#> The following object is masked from 'package:dplyr':
#> 
#>     add_rownames
#> The following object is masked from 'package:purrr':
#> 
#>     every
#> The following object is masked from 'package:ggplot2':
#> 
#>     theme_grey
# library("jtools")
states <- as.data.frame(state.x77)
df_wide <- states[, c("Frost", "Area")]
colnames(df_wide) <- c("cat1_level1", "cat1_level2")
# add column with "SubjectIDs":
df_wide$subject <- c(paste0("S", 1:(nrow(df_wide))))
df_long <- df_wide %>%
  gather(cat1, likertscore,-subject)

# Define models
# base model: fixed effect: cat1
model_lm0   <- lm(likertscore ~ cat1, data = df_long)  
# + random effect: subject => (1 | subject)
model_lm    <- lmer(likertscore ~ cat1 + (1 | subject), data = df_long)  

# 1) unadjusted models => regression table
# Store models in list and output regression table
list_lm_models <- list()
list_lm_models[["model_lm0"]] <- model_lm0
list_lm_models[["model_lm"]] <- model_lm
huxreg(list_lm_models)
#> Warning in checkMatrixPackageVersion(): Package version inconsistency detected.
#> TMB was built with Matrix version 1.2.15
#> Current Matrix version is 1.2.17
#> Please re-install 'TMB' from source using install.packages('TMB', type = 'source') or ask CRAN for a binary version of 'TMB' matching CRAN's 'Matrix' package
#> Warning in knit_print.huxtable(x, ...): Unrecognized output format "markdown". Using `to_screen` to print huxtables.
#> Set options("huxtable.knitr_output_format") manually to "latex", "html", "rtf", "docx", "pptx", "md" or "screen".
───────────────────────────────────────────────────── model_lm0 model_lm
─────────────────────────────────── (Intercept) 104.460     104.460    
(8532.732)    (8532.732)   
cat1cat1_level2 70631.420 *** 70631.420 ***
(12067.105)    (12064.550)   
sd__(Intercept)          1241.576    
         (NA)        
sd__Observation          60322.752    
         (NA)        
─────────────────────────────────── N 100         100        
R2 0.259             
logLik -1241.651     -1221.720    
AIC 2489.303     2451.441    
───────────────────────────────────────────────────── *** p < 0.001; ** p < 0.01; * p < 0.05.

Column names: names, model_lm0, model_lm

# ==> provides regression table with unadjusted p-values


# 2) adjusted models => regression table
# Function to adjust p-values
adjMC <- function( model_name ) {
  model_glht <- glht(model_name)
  model_MCadj <- summary(model_glht, test = adjusted('holm')) # Bonferroni-Holm 
  return(model_MCadj)
}

# Apply function to models
model_lm0_adj <- adjMC( model_name = model_lm0 )  
model_lm_adj <- adjMC( model_name = model_lm )  

# Store adjusted models in list and output regression table
list_lm_models_adj <- list()
list_lm_models_adj[["model_lm0"]] <- model_lm0_adj
list_lm_models_adj[["model_lm"]] <- model_lm_adj
huxreg(list_lm_models_adj)        # huxtable
#> Warning: Unknown or uninitialised column: 'term'.

#> Warning: Unknown or uninitialised column: 'term'.
#> Error in fix.by(by.x, x): 'by' must specify a uniquely valid column
# export_summs(list_lm_models_adj)  # jtools wrapper for huxtable::huxreg
# ==>  Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column 
Created on 2019-08-24 by the reprex package (v0.3.0)

Answer 1

你的问题可以被

看到

tidy(model_lm_adj)
# A tibble: 2 x 6
  lhs               rhs estimate std.error statistic       p.value
  <chr>           <dbl>    <dbl>     <dbl>     <dbl>         <dbl>
1 (Intercept)         0     104.     8533.    0.0122 0.990        
2 cat1cat1_level2     0   70631.    12067.    5.85   0.00000000963

来自?huxreg：

Models must have a generics::tidy() method defined, which should return "term", "estimate", "std.error", "statistic" and "p.value".

summary.glht class 有一个 tidy 方法，但它没有 return 一个 "term" 列。所以 huxreg 感到困惑。在对使用它的统计数据包执行标准时，broom 数据包介于 "anything goes" 和 "whips and chains" 之间。

我会尝试改进报错代码。同时，您可能想使用 tidy_override:

adj_override <- function( model ) {
  model_glht <- glht(model)
  model_mc_adj <- summary(model_glht, test = adjusted('holm')) # Bonferroni-Holm 
  pvals <- tidy(model_mc_adj)$p.value
  return(tidy_override(model, p.value = pvals))
}
overridden_models <- lapply(list_lm_models, adj_override)
huxreg(overridden_models)

顺便说一句，您在这里可能不需要 glht 的所有机器。你可以只使用 stats::p.adjust.

调整使用 lmerTest::lmer() 获得的 p 值以进行多重比较

Adjust p-values obtained with lmerTest::lmer() for multiple comparisons

r

lme4

p-value

lmertest