gtsummary in survey analysis (labels get lost when using the survey package's subset function)

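Variable labels (set with the labelled package) are not preserved when I subset a survey design (with subset() from the survey package), so I end up having to insert the labels manually in the gtsummary function.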

library(dplyr)
library(survey)
library(gtsummary)
library(labelled)

#Reading the CSV file (please download a sample dataframe from link below) 
df <- read.csv("nis_2.csv")

#Change to factors 
factor_vars <- c("htn", "dm", "FEMALE")
df[, factor_vars] <- lapply(df[, factor_vars], factor)

#Changing labels 
var_label(df$AGE) <- "Age"
var_label(df$FEMALE) <- "Gender (Female)"
var_label(df$dm) <- "Diabetes"
var_label(df$htn) <- "Hypertension"

#declare survey design
#survey.lonely.psu is a global option of the survey package, not a svydesign() argument
options(survey.lonely.psu = "adjust")
dstr <- svydesign(
 id = ~HOSP_NIS,
 strata = ~NIS_STRATUM,
 weights = ~DISCWT,
 nest = TRUE,
 data = df)

#subset the design to patients with hypertension (htn == 1)
small_set <- subset(dstr, (htn == 1))
summary(small_set)

small_set %>%
  tbl_svysummary(
   by=dm,
   include = c(AGE, FEMALE),
   missing = "no", 
   statistic = all_continuous() ~ "{mean} ({sd})"
  ) %>%
  add_p() %>%
  add_overall() %>%
  modify_caption("**Table 1. Patient Characteristics**") %>%
  modify_spanning_header(c("stat_1", "stat_2") ~ "**History of Diabetes**")

The sample dataset is at: https://github.com/Dr-Kaboum/nis_gt_summary/blob/16909872624714d1feb30bd501a6204aba947de7/nis_2.csv

subset() drops the label attributes. Subset the data first, label it, and then pass it to gtsummary.

#example of the label being removed. 
library(labelled)


var_label(mtcars$mpg) <- "Mile per gallon"

mt2 <- subset(mtcars, cyl == 4)
var_label(mt2$mpg) <- "Mile per gallon" #need to relabel
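
To check this on your own data, a minimal sketch along these lines may help. It works on a copy of mtcars (the names mt and mt4 are only for illustration), tests whether the label survived the subset, and then copies the label over from the original column instead of retyping it.

```r
library(labelled)

mt <- mtcars                                  # work on a copy of the built-in data
var_label(mt$mpg) <- "Miles per gallon"

mt4 <- subset(mt, cyl == 4)

# TRUE here means subset() dropped the label attribute
is.null(var_label(mt4$mpg))

# Reattach the label by reading it from the original column
var_label(mt4$mpg) <- var_label(mt$mpg)
var_label(mt4$mpg)
```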

Use var_label() to edit the subset of the survey object

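I don't have access to your data, but using an example dataset I show that you can relabel the data by accessing the variables element of the survey design object (which is a list). If you add the label there, it will show up in the table.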

# A dataset with a complex design
library(gtsummary)
data(api, package = "survey")
labelled::var_label(apiclus1$api99) <- "API 99"

survdat <-
  survey::svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc) 
#this subset will remove labels
survdat2 <- subset(survdat, (cname == "Fresno"))
#relabel here after subset within survey object
labelled::var_label(survdat2$variables$api99) <- "API 99"

#make table with label
ex <- tbl_svysummary(data = survdat2, by = "both", include = c(cname, api00, api99, both))
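
If there are many labelled variables, relabelling each one by hand gets tedious. The following is only a sketch of one way to automate it: restore_labels() is a hypothetical helper (not part of survey, labelled, or gtsummary) that copies every label found in the original design's variables data frame onto the subsetted design. It assumes the labels were set before svydesign() was called, as in the example above.

```r
library(survey)
library(labelled)

data(api, package = "survey")
var_label(apiclus1$api99) <- "API 99"
var_label(apiclus1$api00) <- "API 00"

full_design <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)
sub_design  <- subset(full_design, cname == "Fresno")

# Hypothetical helper: copy variable labels from the full design to the subset
restore_labels <- function(sub, full) {
  shared <- intersect(names(sub$variables), names(full$variables))
  for (v in shared) {
    lab <- var_label(full$variables[[v]])
    # Only touch variables that actually carry a label in the full design
    if (!is.null(lab)) var_label(sub$variables[[v]]) <- lab
  }
  sub
}

sub_design <- restore_labels(sub_design, full_design)
var_label(sub_design$variables$api99)
```

The relabelled sub_design can then be passed to tbl_svysummary() in the same way as survdat2 above.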