How to skip NA values when using summary_table with the qwraps2 package?
I am trying to make a baseline characteristics table using qwraps2. My data is:
> str(joined_df2)
'data.frame': 259 obs. of 23 variables:
$ SUBJID : chr "S001011" "S001013" "S001016" "S001017" ...
$ AGE : num 72 74 65 46 59 71 71 64 63 58 ...
$ AGEU : chr "YEARS" "YEARS" "YEARS" "YEARS" ...
$ FASFL.x : chr "Y" "Y" "Y" "Y" ...
$ SAFFL : chr "Y" "Y" "Y" "Y" ...
$ TRT01P : chr "Treatment B" "Treatment A" "Treatment B" "Treatment B" ...
$ HGTBL : num 1.68 1.57 1.73 1.8 1.78 ...
$ HGTBLU : chr "m" "m" "m" "m" ...
$ WGTBL : num 224 187 70.7 123.9 70.9 ...
$ WGTBLU : chr "lb" "lb" "kg" "kg" ...
$ DIABDUR : num 8 22 20 6 9 7 12 12 6 5 ...
$ DIABDURU: chr "years" "years" "years" "years" ...
$ FASFL.y : chr "Y" "Y" "Y" "Y" ...
$ TRTP : chr "Treatment B" "Treatment A" "Treatment B" "Treatment B" ...
$ AVISIT : chr "Visit 10 (Week 0)" "Visit 10 (Week 0)" "Visit 10 (Week 0)" "Visit 10 (Week 0)" ...
$ VISITNUM: num 10 10 10 10 10 10 10 10 10 10 ...
$ PARAM : chr "HbA1c Blood (%)" "HbA1c Blood (%)" "HbA1c Blood (%)" "HbA1c Blood (%)" ...
$ PARAMCD : chr "C64849B" "C64849B" "C64849B" "C64849B" ...
$ AVAL : num 8.6 8.4 7 7.3 8.2 7.7 7.3 8.8 7.3 8.4 ...
$ AVALU : chr "%" "%" "%" "%" ...
$ ANL01FL : chr "Y" "Y" "Y" "Y" ...
$ ANL01REA: chr NA NA NA NA ...
$ TRTP2 : chr "Treatment B" "Treatment A" "Treatment B" "Treatment B" ...
I want to include the mean (SD), median, minimum, and maximum of the variable AGE, grouped by TRTP2. The variable AGE contains two NA values:
> joined_df2[is.na(joined_df2$AGE),]
SUBJID AGE AGEU FASFL.x SAFFL TRT01P HGTBL HGTBLU WGTBL WGTBLU DIABDUR DIABDURU FASFL.y TRTP AVISIT VISITNUM PARAM PARAMCD AVAL AVALU ANL01FL ANL01REA TRTP2
18 S001054 NA <NA> <NA> <NA> <NA> NA <NA> NA <NA> NA <NA> Y Treatment A Visit 10 (Week 0) 10 HbA1c Blood (%) C64849B 8.4 % Y <NA> Treatment A
146 S051018 NA <NA> <NA> <NA> <NA> NA <NA> NA <NA> NA <NA> Y Treatment A Visit 10 (Week 0) 10 HbA1c Blood (%) C64849B 7.4 % Y <NA> Treatment A
When I run my code, I get an error:
> library(qwraps2)
> options(qwraps2_markup = 'markdown') # default is latex
> joined_df2_summaries <-
+ list("Age (yrs)" =
+ list(
+ "Mean (SD)" = ~ qwraps2::mean_sd(AGE, denote_sd = "paren"),
+ "Median" = ~ qwraps2::median_iqr(AGE),
+ "Min:" = ~ min(AGE),
+ "Max:" = ~ max(AGE)))
> summary_table(dplyr::group_by(joined_df2, TRTP2), joined_df2_summaries)
Error in quantile.default(x, probs = c(1, 3)/4, na.rm = na_rm) :
missing values and NaN's not allowed if 'na.rm' is FALSE
I tried passing na.rm=TRUE inside the calls, but it did not help:
> joined_df2_summaries <-
+ list("Age (yrs)" =
+ list(
+ "Mean (SD)" = ~ qwraps2::mean_sd(AGE, denote_sd = "paren", na.rm=TRUE),
+ "Median" = ~ qwraps2::median_iqr(AGE, na.rm=TRUE),
+ "Min:" = ~ min(AGE, na.rm=TRUE),
+ "Max:" = ~ max(AGE, na.rm=TRUE)))
> summary_table(dplyr::group_by(joined_df2, TRTP2), joined_df2_summaries)
Error in qwraps2::mean_sd(AGE, denote_sd = "paren", na.rm = TRUE) :
unused argument (na.rm = TRUE)
How can I calculate the mean etc. of AGE excluding the NA values?
I would use expss to solve this. You can easily group by a categorical variable and get summary statistics with expss. For example:
library(expss) # needed for the unqualified tab_* and w_* functions below
mtcars %>% expss::tab_cells(mpg,hp,qsec) %>%
tab_cols(gear) %>% # we will make rows with `gear` with transpose command later
tab_stat_fun("My Mean Label"=w_mean,
w_sd,
w_min,
w_max, method='list', label = "|") %>%
tab_pivot() %>%
tab_transpose() %>% # take the result and flip it
htmlTable()
In qwraps2::mean_sd and qwraps2::median_iqr, the argument for ignoring missing values is na_rm, not na.rm. Base R's min and max still take na.rm, and without it they return NA for any group that contains a missing value. Try this:
joined_df2_summaries <-
list("Age (yrs)" =
list(
"Mean (SD)" = ~ qwraps2::mean_sd(AGE, na_rm = TRUE, denote_sd = "paren"),
"Median" = ~ qwraps2::median_iqr(AGE, na_rm = TRUE),
"Min:" = ~ min(AGE, na.rm = TRUE),
"Max:" = ~ max(AGE, na.rm = TRUE)))
summary_table(joined_df2, summaries = joined_df2_summaries, by = "TRTP2")
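To see why the Min and Max rows need their own handling, here is a minimal base R sketch. The data frame `toy` and its values are hypothetical, made up for illustration and not taken from joined_df2. It only shows how `min` and `max` behave per group with and without `na.rm`, independently of qwraps2.

```r
# Hypothetical toy data (not the asker's data): one NA per treatment group
toy <- data.frame(
  TRTP2 = c("Treatment A", "Treatment A", "Treatment A",
            "Treatment B", "Treatment B", "Treatment B"),
  AGE   = c(72, NA, 46, 74, 59, NA)
)

# Without na.rm, a single NA makes the whole group's minimum NA
min_naive <- tapply(toy$AGE, toy$TRTP2, min)

# Extra arguments to tapply are passed on to FUN, so na.rm = TRUE reaches min/max
min_by_trt <- tapply(toy$AGE, toy$TRTP2, min, na.rm = TRUE)
max_by_trt <- tapply(toy$AGE, toy$TRTP2, max, na.rm = TRUE)

print(min_naive)
print(min_by_trt)
print(max_by_trt)
```

If you would rather not pass a missing-value argument to every summary, another option is to drop the rows with missing AGE first, for example with `joined_df2[!is.na(joined_df2$AGE), ]`, and summarise that subset. Bear in mind that this also changes the group sizes reported for that table.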