For loop utilizing colnames as object in tabyl pipeline

Question

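I need to run chi-squared tests on multiple contingency tables and store the results in a data frame. I thought of using the tabyl and chisq.test functions. My original dataset contains patient symptom reports. A made-up example:

Data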

df <- structure(list(Race = c("White", "Asian", "White", "Asian", "Black", 
"Asian", "Black", "White"), Headache = c("No", "No", "Yes", "Yes", "No", 
"No", "Yes", "Yes"), Paraesthesias = c("No", "Yes", "Yes", "Yes", "Yes", "No", "No", "No"
), Heartburn = c("Yes", "No", "No", "Yes", "No", "Yes", "Yes", "Yes")), row.names = c(NA, 
-8L), class = "data.frame")

print(df)

Desired outcome, obtained by brute force

library(dplyr)    # provides the %>% pipe
library(janitor)  # provides tabyl()

headache_p <- chisq.test(df[c(1,2)] %>% tabyl(Race, Headache))$p.value

paraesthesias_p <- chisq.test(df[c(1,3)] %>% tabyl(Race, Paraesthesias))$p.value

heartburn_p <- chisq.test(df[c(1,4)] %>% tabyl(Race, Heartburn))$p.value

data.frame("Headache" = headache_p, "Paraesthesias" = paraesthesias_p, "Heartburn" = heartburn_p, row.names = "p.value")

Attempt to get the desired outcome with a loop

y <- list()

for (i in 2:4) {
    z <- chisq.test(df[c(1, i)] %>% tabyl(Race, colnames(df[i]), show_na = FALSE))
    y <- c(y, z)
}

setNames(data.frame(y, row.names = "p.value"), colnames(df)[-1])

Error message

Error: Can't extract columns that don't exist.
x Column `Headache` doesn't exist.
Run `rlang::last_error()` to see where the error occurred.

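Question

How can I write a for loop for this process? My original dataset has more than 60 symptoms, so a loop is necessary. I don't know how to pass the column name into the pipeline, because it is treated as a character string rather than as an object.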

Answer 1

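You can use map_dbl here to do the work of the for loop.

Because you are passing the column names as character strings, you can use the .data pronoun inside tabyl: .data[[.x]] looks up the column whose name is stored in .x.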

library(purrr)
library(dplyr)
library(janitor)

cols <- names(df[-1])

map_dbl(cols, ~chisq.test(df %>% tabyl(Race, .data[[.x]]))$p.value) %>%
  t %>%
  as.data.frame() %>%
  setNames(cols)

#   Headache Paraesthesias Heartburn
#1 0.7165313     0.7165313 0.9149472
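
If an explicit for loop is preferred, as the question asks, the same idea can be sketched as follows. This is only an illustration under a few assumptions: it swaps tabyl for base R's table() to build each contingency table, and the names symptoms, p_values and result are made up for this example. The key step is df[[col]], which extracts a column from a name held in a string.

```r
# Toy data from the question
df <- data.frame(
  Race          = c("White", "Asian", "White", "Asian", "Black", "Asian", "Black", "White"),
  Headache      = c("No", "No", "Yes", "Yes", "No", "No", "Yes", "Yes"),
  Paraesthesias = c("No", "Yes", "Yes", "Yes", "Yes", "No", "No", "No"),
  Heartburn     = c("Yes", "No", "No", "Yes", "No", "Yes", "Yes", "Yes")
)

# Every column except Race is treated as a symptom (hypothetical helper name)
symptoms <- setdiff(names(df), "Race")

# Pre-allocate a named numeric vector with one slot per symptom
p_values <- setNames(numeric(length(symptoms)), symptoms)

for (col in symptoms) {
  # df[[col]] extracts the column whose name is stored in the string `col`
  tab <- table(df$Race, df[[col]])
  # With counts this small, chisq.test() warns that the approximation
  # may be inaccurate; the warning is silenced here for readability
  p_values[[col]] <- suppressWarnings(chisq.test(tab))$p.value
}

# One-row data frame, matching the layout of the brute-force result
result <- as.data.frame(t(p_values), row.names = "p.value")
print(result)
```

With 60 or more symptoms only the data changes; the loop picks up every column other than Race.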

for-loop

r

binary-data

janitor