Error in 'add_p()' for variable X and test 'fisher.test', p-value omitted

Question

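I get an error when I try to use the add_p() function to get a p-value for my by variable (with 10 levels) against a categorical variable with two levels (yes/no). I am not sure how to provide a reproducible example. In terms of the trial data, I think my by variable would be the "T Stage" variable with 10 levels, and the categorical variables would be: (1) "Chemotherapy Treatment" with 2 levels, and (2) "Chemotherapy Treatment2" with 4 levels. Here is the code I am running: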

library(gtsummary)
library(tidyverse)
miro_def %>% 
  select(mheim, age_dx, time_t1d_yrs, gender, collard, fhist_pandz) %>% 
  tbl_summary(by = mheim, missing = "no",
              type = list(c(gender, collard, fhist_pandz, mheim) ~ "categorical"),
              label = list(gender ~ "Gender", 
                           fhist_pandz ~ "Family history of PD", 
                           age_dx ~ "Age at diagnosis", 
                           time_t1d_yrs ~ "Follow-up(years)")) %>% 
  add_p() %>% 
  # style the output with custom header
  # modify_header(stat_by = "{level}") %>%
  # convert to kableExtra
  as_kable_extra(booktabs = TRUE) %>%
  # reduce font size to make table fit
  # (you may also use the `latex_options = "scale_down"` argument here)
  kableExtra::kable_styling(font_size = 7, latex_options = "scale_down")

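However, I do get a p-value for the by variable (10 levels) against the other (continuous/numeric) variables.

How can I fix this error?
What should I do to get p-values when both the by variable and the categorical variable have more than 2 levels?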

There was an error in 'add_p()' for variable 'gender' and test 'fisher.test', p-value omitted: Error in stats::fisher.test(data[[variable]], as.factor(data[[by]])): FEXACT error 7(location). LDSTP=18540 is too small for this problem, (pastp=51.2364, ipn_0:=ipoin[itp=150]=215, stp[ipn_0]=40.6787). Increase workspace or consider using 'simulate.p.value=TRUE'

There was an error in 'add_p()' for variable 'collard' and test 'fisher.test', p-value omitted: Error in stats::fisher.test(data[[variable]], as.factor(data[[by]])): FEXACT error 7(location). LDSTP=18570 is too small for this problem, (pastp=37.0199, ipn_0:=ipoin[itp=211]=823, stp[ipn_0]=23.0304). Increase workspace or consider using 'simulate.p.value=TRUE'

There was an error in 'add_p()' for variable 'fhist_pandz' and test 'fisher.test', p-value omitted: Error in stats::fisher.test(data[[variable]], as.factor(data[[by]])): FEXACT error 7(location). LDSTP=18570 is too small for this problem, (pastp=36.4614, ipn_0:=ipoin[itp=58]=1, stp[ipn_0]=31.8106). Increase workspace or consider using 'simulate.p.value=TRUE'

Answer 1

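Since no one has posted an answer, here is the approach I used when I ran into this problem. Following the example in the help file ?gtsummary::add_p.tbl_summary, I wrote a custom function that runs fisher.test with the simulate.p.value = TRUE option: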

## define custom test
fisher.test.simulate.p.values <- function(data, variable, by, ...) {
  result <- list()
  test_results <- stats::fisher.test(data[[variable]], data[[by]], simulate.p.value = TRUE)
  result$p <- test_results$p.value
  result$test <- test_results$method
  result
}

## add p-values to your gtsummary table, using custom test defined above
summary_table %>%
add_p(
  test = list(all_categorical() ~ "fisher.test.simulate.p.values")  # this applies the custom test to all categorical variables
)

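You can also change the number of replicates used to compute the simulated p-values by passing a value other than the default B = 2000 to the fisher.test() call above.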
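To illustrate the effect of B outside of gtsummary, here is a minimal sketch that calls stats::fisher.test() directly. The data are hypothetical (randomly generated), and the names `group`, `outcome`, `res_default` and `res_more` are made up for this example; only `simulate.p.value` and `B` are actual fisher.test() arguments.

```r
set.seed(42)  # simulated p-values are random, so fix the seed for reproducibility

# Hypothetical data: a 10-level grouping variable and a 2-level categorical variable
n <- 300
group   <- factor(sample(paste0("T", 1:10), n, replace = TRUE))
outcome <- factor(sample(c("yes", "no"), n, replace = TRUE))

# Monte Carlo version of Fisher's test with the default B = 2000 replicates
res_default <- stats::fisher.test(group, outcome, simulate.p.value = TRUE)

# Same test with more replicates (an arbitrary illustrative value)
res_more <- stats::fisher.test(group, outcome, simulate.p.value = TRUE, B = 10000)

res_default$p.value
res_more$p.value
res_more$method  # the method string reports that the p-value was simulated
```

More replicates generally reduce the Monte Carlo error of the simulated p-value, at the cost of a longer run time.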

Of course, all of this assumes that Fisher's test is appropriate in the first place.

Answer 2

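Because it solved the problem for me, I would like to point out that since gtsummary version 1.3.6, add_p() has an argument, test.args, that lets you pass additional arguments to the test function. Thanks to the developers!

From NEWS:
Each add_p() method now has the test.args = argument. Use this argument to pass additional arguments to the statistical method, e.g.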

add_p(test = c(age, marker) ~ "t.test",
      test.args = c(age, marker) ~ list(var.equal = TRUE))

This is also described in the add_p() help (i.e. ?add_p).

Answer 3

I had a similar problem. You have to increase the workspace using test.args inside add_p().

miro_def %>% 
  select(mheim, age_dx, time_t1d_yrs, gender, collard, fhist_pandz) %>% 
  tbl_summary(by = mheim, missing = "no",
              type = list(c(gender, collard, fhist_pandz, mheim) ~ "categorical"),
              label = list(gender ~ "Gender", 
                           fhist_pandz ~ "Family history of PD", 
                           age_dx ~ "Age at diagnosis", 
                           time_t1d_yrs ~ "Follow-up(years)")) %>% 
  add_p(test.args = variable_with_no_pval ~ list(workspace=2e9))

or

add_p(test.args = all_tests("fisher.test") ~ list(workspace=2e9))
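For reference, here is a minimal sketch of what the workspace argument does at the level of stats::fisher.test(), outside of gtsummary. It uses the job satisfaction table from the ?fisher.test help page; the value 2e6 and the names `res_default` and `res_large` are illustrative choices, not recommendations. According to ?fisher.test, workspace is given in units of 4 bytes, so workspace = 2e9 requests roughly 8 GB of memory, and a smaller value may already be enough for your table.

```r
# Job satisfaction data from the ?fisher.test help page (4 x 4 table)
Job <- matrix(c(1, 2, 1, 0,  3, 3, 6, 1,  10, 10, 14, 9,  6, 7, 12, 11), 4, 4,
              dimnames = list(income = c("< 15k", "15-25k", "25-40k", "> 40k"),
                              satisfaction = c("VeryD", "LittleD", "ModerateS", "VeryS")))

# Exact test with the default workspace (200000)
res_default <- stats::fisher.test(Job)

# Exact test with a larger workspace; 2e6 units of 4 bytes is about 8 MB
res_large <- stats::fisher.test(Job, workspace = 2e6)

# The workspace only affects whether the network algorithm has enough memory,
# not the result, so both p-values agree
res_default$p.value
res_large$p.value
```

If the exact test still fails or runs too long with a larger workspace, the simulated p-value approach from Answer 1 is the alternative that the error message itself suggests.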
