Sorting overlapping categorical variables in {gtsummary}

Question

require(gtsummary)

test <- structure(list(`1` = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 0), `2` = c(1,0, 0, 0, 0, 1, 0, 1, 0, 0), `3` = c(0, 0, 0, 0, 0, 0, 0, 0, 0,0), `4` = c(1, 1, 0, 0, 1, 0, 0, 0, 0, 0), `5` = c(1, 0, 1, 1,0, 1, 1, 0, 0, 0), `6` = c(0, 0, 0, 1, 0, 0, 1, 0, 0, 0), `7` = c(0,0, 0, 0, 0, 0, 0, 0, 0, 0), `8` = c(0, 0, 0, 0, 0, 0, 0, 0, 0,0), `9` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `10` = c(0, 0, 0,0, 0, 0, 0, 0, 0, 1)), row.names = c(NA, -10L), class = c("tbl_df","tbl", "data.frame"))

In this sample data, I have 10 categorical variables.

     `1`   `2`   `3`   `4`   `5`   `6`   `7`   `8`   `9`  `10`
   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
 1     0     1     0     1     1     0     0     0     0     0
 2     0     0     0     1     0     0     0     0     0     0
 3     0     0     0     0     1     0     0     0     0     0
 4     0     0     0     0     1     1     0     0     0     0
 5     0     0     0     1     0     0     0     0     0     0
 6     0     1     0     0     1     0     0     0     0     0
 7     0     0     0     0     1     1     0     0     0     0
 8     0     1     0     0     0     0     0     0     0     0
 9     1     0     0     0     0     0     0     0     0     0
10     0     0     0     0     0     0     0     0     0     1

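Because the categories can overlap with one another, I put each one in its own column, coded 1 or 0 to indicate whether an observation does or does not have that category.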

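When I run test %>% tbl_summary(), it creates a table with the variables listed in their original column order.

I would like to order them by frequency, but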

test %>% tbl_summary(sort = list(everything() ~ "frequency"))

has no effect.

Is there a way to do this? Thanks in advance.

Answer 1

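The tbl_summary(sort=) argument sorts the levels within a variable, not the order in which the variables appear in the table. Variables appear in the table in the same order as they appear in the data frame.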
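To make that distinction concrete, here is a minimal sketch. It assumes a small made-up data frame (the columns grade and flag are hypothetical and not part of the question's data) with one multi-level variable and one 0/1 indicator.

```r
library(gtsummary)

# Hypothetical data: a multi-level variable followed by a 0/1 indicator
df <- data.frame(
  grade = c("II", "I", "III", "II", "II", "III"),
  flag  = c(0, 1, 0, 0, 1, 0)
)

# sort= reorders the levels inside `grade` by descending frequency
# (II, then III, then I). It does not move `grade` relative to `flag`:
# the variables still follow the column order of the data frame.
tbl <- tbl_summary(df, sort = list(grade ~ "frequency"))

# Variables in the order they appear in the table
unique(tbl$table_body$variable)
```

A 0/1 column such as flag is summarized as a single row, so it has no levels for sort= to reorder, which is consistent with the question's observation that the argument appeared to do nothing.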

We can update the order of the columns in the data frame using the code below.

library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.5.0'

test <- structure(list(`1` = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 0), `2` = c(1,0, 0, 0, 0, 1, 0, 1, 0, 0), `3` = c(0, 0, 0, 0, 0, 0, 0, 0, 0,0), `4` = c(1, 1, 0, 0, 1, 0, 0, 0, 0, 0), `5` = c(1, 0, 1, 1,0, 1, 1, 0, 0, 0), `6` = c(0, 0, 0, 1, 0, 0, 1, 0, 0, 0), `7` = c(0,0, 0, 0, 0, 0, 0, 0, 0, 0), `8` = c(0, 0, 0, 0, 0, 0, 0, 0, 0,0), `9` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `10` = c(0, 0, 0,0, 0, 0, 0, 0, 0, 1)), row.names = c(NA, -10L), class = c("tbl_df","tbl", "data.frame")) 

# order variables by prevalence
prev <- purrr::map_dbl(test, mean) %>% sort(decreasing = TRUE)

test %>%
  select(all_of(names(prev))) %>%
  tbl_summary() %>%
  as_kable() # convert to kable for SO

Characteristic	N = 10
5	5 (50%)
2	3 (30%)
4	3 (30%)
6	2 (20%)
1	1 (10%)
10	1 (10%)
3	0 (0%)
7	0 (0%)
8	0 (0%)
9	0 (0%)
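The reordering step can also be written in base R without purrr. The following is a sketch on a made-up data frame (the column names fever, cough, and rash are hypothetical), not the answer's own code; it only shows that the columns end up sorted by the proportion of 1s.

```r
# Hypothetical 0/1 indicator columns
indicators <- data.frame(
  fever = c(0, 0, 1, 0),
  cough = c(1, 1, 1, 0),
  rash  = c(0, 1, 1, 0)
)

# The mean of a 0/1 column is its prevalence; sort from most to least common
prev <- sort(colMeans(indicators), decreasing = TRUE)

# Reorder the columns by name; drop = FALSE keeps a data frame
# even if only one column is present
ordered <- indicators[, names(prev), drop = FALSE]

names(ordered)
```

The reordered data frame can then be passed to tbl_summary() in the same way as in the answer's code.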

Created on 2021-12-10 by the reprex package (v2.0.1)
