How to add percentage to "unknown" in gtsummary
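I have a continuous variable with a large proportion of unknown values. My advisor has asked me to put the percentage next to the unknown count in the column. This reprex mimics what I am trying to do.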
library(tidyverse)
library(gtsummary)

trial %>% # included with gtsummary package
  select(trt, age, grade) %>%
  tbl_summary()
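I am trying to get the percentage of unknowns listed next to the unknown count, ideally in parentheses. It would look like 11 (5.5%).
Some people asked how the missing data appears in my dataset, so here is a reprex of it: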
library(gtsummary)
library(tidyverse)
#> Warning: package 'tibble' was built under R version 4.0.3
#> Warning: package 'readr' was built under R version 4.0.3

df <-
  tibble::tribble(
    ~age, ~sex, ~race, ~weight,
    70, "male", "white", 50,
    57, "female", "african-american", 87,
    64, "male", "white", NA,
    46, "male", "white", 49,
    87, "male", "hispanic", 51
  )

df %>%
  select(age, sex, race, weight) %>%
  tbl_summary(
    type = list(age ~ "continuous", weight ~ "continuous"),
    missing = "ifany"
  )
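There are a few ways to report the missing rate. I'll illustrate a few below, and you can choose the solution that works best for you.
- Categorical variables: I recommend making the missing values an explicit factor level before passing the data frame to tbl_summary(). The NA values will then no longer be treated as missing and will be counted like any other level of the variable.
- Continuous variables: use the statistic= argument to report the missing rate.
- All variables: use add_n() to report the missing rate.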
library(gtsummary)

trial %>%
  select(age, response, trt) %>%
  # make the NA values an explicit factor level with `forcats::fct_explicit_na()`
  # (deprecated since forcats 1.0.0 in favor of `forcats::fct_na_value_to_level()`)
  dplyr::mutate(response = factor(response) %>% forcats::fct_explicit_na()) %>%
  tbl_summary(
    by = trt,
    type = all_continuous() ~ "continuous2",
    statistic = all_continuous() ~ c(
      "{N_nonmiss}/{N_obs} {p_nonmiss}%",
      "{median} ({p25}, {p75})"
    )
  ) %>%
  # add a column reporting the counts for every variable
  add_n(statistic = "{n} / {N}")
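For what it's worth, here is a rough base R sketch of what the explicit-NA step in the example above is doing, in case it helps to see the idea without forcats. The helper name `explicit_na` and the toy `response` vector are made up for illustration; this only mimics the idea and is not how forcats implements it.

```r
# Hypothetical helper (not part of forcats or gtsummary): turn NA into a
# regular, named factor level so it gets counted like any other category.
explicit_na <- function(x, na_level = "Unknown") {
  f <- addNA(factor(x))                     # add NA as a factor level
  levels(f)[is.na(levels(f))] <- na_level   # give that level a visible name
  f
}

response <- c(0, 1, NA, 1, 0, NA)  # toy data, assumed for illustration
f <- explicit_na(response)

levels(f)  # "0" "1" "Unknown"
table(f)   # the two former NA values are now counted under "Unknown"
```

Once the NA values are an ordinary level like this, a summary of the categorical variable reports their count and percentage alongside the other levels.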
EDIT: Adding more examples after the original poster's comments.
library(gtsummary)

trial %>%
  select(age, response, trt) %>%
  # make the NA values an explicit factor level named "Unknown"
  dplyr::mutate(
    response = factor(response) %>% forcats::fct_explicit_na(na_level = "Unknown")
  ) %>%
  tbl_summary(
    by = trt,
    type = all_continuous() ~ "continuous2",
    missing = "no",
    statistic = all_continuous() ~ c(
      "{median} ({p25}, {p75})",
      "{N_miss} ({p_miss}%)"
    )
  ) %>%
  # updating the Unknown label in the `.$table_body`
  modify_table_body(
    dplyr::mutate,
    label = ifelse(label == "N missing (% missing)", "Unknown", label)
  )
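If it helps to see where the numbers come from, the sketch below recomputes by hand what the `{N_miss}` and `{p_miss}` placeholders represent, using the `weight` column from the question. `missing_summary` is a hypothetical helper, not a gtsummary function, and the one-decimal rounding is my assumption; gtsummary applies its own formatting.

```r
# Hypothetical helper, base R only: format "count (percent%)" for missing values.
missing_summary <- function(x, digits = 1) {
  n_obs  <- length(x)                             # what {N_obs} represents
  n_miss <- sum(is.na(x))                         # what {N_miss} represents
  p_miss <- round(100 * n_miss / n_obs, digits)   # what {p_miss} represents
  sprintf("%d (%s%%)", n_miss, format(p_miss, nsmall = digits))
}

weight <- c(50, 87, NA, 49, 51)  # the weight column from the question's df
missing_summary(weight)
#> [1] "1 (20.0%)"
```

This is only a sanity check on the arithmetic; in the table itself the statistic= argument shown above does the work.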