Calculating a global p-value for a linear regression model on multiply imputed datasets using tbl_regression()

Question

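I am trying to compute a global p-value with the gtsummary package on my multiply imputed datasets, and I have not found a solution yet. I know gtsummary can build linear regression tables from multiply imputed datasets, but I don't think add_global_p() is set up for this. I understand that an ANOVA on MI datasets requires mi.anova from the miceadds package, whereas gtsummary uses the car::Anova() function. Does anyone have a solution for this?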

# loads relevant packages using the pacman package
pacman::p_load(
  tidyverse,   # data management and visualization
  mice,        # for multiple imputation
  gtsummary)   # for tables

# generate a small sample of the boys dataset for MI
boys_miss <- sample(head(boys,100))

# impute a sample of the boys dataset 
boys_imp <- parlmice(boys_miss,
                        m = 5,
                        maxit = 5,
                        cluster.seed = 1234)

# run linear regression on the imputed dataset
boys_imp %>% 
  with(.,
       lm(wgt ~ reg)
  ) %>% 
  tbl_regression() %>% 
  add_global_p() # adding this call produces the error below


x `add_global_p()` uses `car::Anova()` to calculate the global p-value,
and the function returned an error while calculating the p-values.
Is your model type supported by `car::Anova()`?
  Error in UseMethod("vcov") : 
  no applicable method for 'vcov' applied to an object of class "c('mira', 'matrix')"

I would like the table to look like this:

Answer 1

You need to calculate the p-value yourself and add it to the gtsummary table using modify_table_body(). Example below!

# loads relevant packages using the pacman package
pacman::p_load(
  tidyverse,   # data management and visualization
  mice,        # for multiple imputation
  gtsummary)   # for tables

# generate a small sample of the boys dataset for MI
boys_miss <- sample(head(boys,100))

# impute a sample of the boys dataset 
boys_imp <- parlmice(boys_miss,
                     m = 5,
                     maxit = 5,
                     cluster.seed = 1234)


tbl <- 
  # build linear regression on the imputed dataset
  boys_imp %>% 
  with(lm(wgt ~ reg)) %>% 
  tbl_regression() %>%
  # replace individual p-values with global p-value
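  modify_table_body(
    ~ .x %>% 
      # drop the per-coefficient p-values produced by tbl_regression()
      select(-p.value) %>%
      full_join(
        # pooled Type II ANOVA across the imputed datasets (needs the miceadds package installed)
        miceadds::mi.anova(boys_imp, formula = "wgt ~ reg", type = 2) %>%
          # flatten the result so the p-value ends up in the column `anova.table.Pr..F.`
          as.data.frame() %>%
          tibble::rownames_to_column(var = "variable") %>%
          filter(variable != "Residual") %>%
          # attach the p-value to the variable's label row; trim whitespace in the row names
          mutate(row_type = "label",
                 variable = str_trim(variable)) %>%
          select(variable, row_type, p.value = anova.table.Pr..F.),
        by = c("variable", "row_type")
      )
  )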
#> pool_and_tidy_mice(): Tidying mice model with
#> `mice::pool(x) %>% mice::tidy(exponentiate = FALSE, conf.int = TRUE, conf.level = 0.95)`
#> Univariate ANOVA for Multiply Imputed Data (Type 2)  
#> 
#> lm Formula:  wgt ~ reg
#> R^2=0.0619 
#> ..........................................................................
#> ANOVA Table 
#>                   SSQ df1      df2 F value  Pr(>F)    eta2 partial.eta2
#> reg          10.65048   4 3461.232  1.4994 0.19958 0.06191      0.06191
#> Residual    161.39186  NA       NA      NA      NA      NA           NA

Created on 2021-08-05 by the reprex package (v2.0.0)
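
The column name `anova.table.Pr..F.` in the answer's code can look cryptic. It appears to come from how base R flattens a list into a data frame: each column of a nested data frame is prefixed with the list element's name, and non-syntactic names such as `Pr(>F)` are rewritten with dots. The sketch below illustrates this with a hand-built stand-in, so `mock_result` is hypothetical. It assumes that mi.anova returns a list with `r.squared` and `anova.table` elements, which the column name suggests, and its numbers are copied from the printed ANOVA table above.

```r
# Hypothetical stand-in for the list returned by miceadds::mi.anova().
# Assumption: the real result has `r.squared` and `anova.table` elements.
mock_result <- list(
  r.squared = 0.0619,
  anova.table = data.frame(
    SSQ      = c(10.65048, 161.39186),
    df1      = c(4, NA),
    `Pr(>F)` = c(0.19958, NA),
    row.names   = c("reg", "Residual"),
    check.names = FALSE  # keep the non-syntactic name "Pr(>F)" as-is
  )
)

# as.data.frame() on a list prefixes nested columns with the element name
# and makes the resulting names syntactic.
flat <- as.data.frame(mock_result)

names(flat)     # includes "anova.table.Pr..F."
rownames(flat)  # row names are carried over from the nested table
```

This is why the answer moves the row names into a `variable` column with rownames_to_column() before joining: the row names are the only place the term names are stored.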
