Creating a Label column for character variables
Suppose my example data set contains only character variables.
dxe1<-c("W07XXXA", "NULL", "3")
dxe1_poa<-c("Y","NULL","N")
dxe2<-c("NULL","NULL","NULL")
dxe2_poa<-c("NULL","NULL","NULL")
df3 <- data.frame(dxe1,dxe1_poa, dxe2,dxe2_poa)
I want to label the variables, so I created a vector of labels for them:
library(Hmisc)  # assumed source of label(); the post does not name the package
var.labels = c(dxe1="External Cause of Injury Diagnosis 1",
               dxe1_poa="External Cause of Injury Diagnosis 1 - Present on Admission",
               dxe2="External Cause of Injury Diagnosis 2",
               dxe2_poa="External Cause of Injury Diagnosis 2 - Present on Admission")
label(df3) = as.list(var.labels[match(names(df3), names(var.labels))])
label(df3)
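My goal is to create a table like the one below, with a Label column that gives each variable's description. The only statistic I want reported is the number of missing observations; the minimum, maximum, mean, and standard deviation should simply show n.a, as in the table below.
I am trying the following code: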
df3 <- Filter(is.character, df3)
Variables <- names(df3)
Label <- label(df3)
Missing <- sapply(df3, function(x) sum(is.na(x)))
Type <- sapply(df3, function(x) {tmp <- class(x);if(length(x) > 1) tmp[2] else tmp[1]})
Min <- sapply(df3, function(x) min(x, na.rm = TRUE))
Max <- sapply(df3, function(x) max(x, na.rm = TRUE))
SD <- sapply(df3, function(x) format(round(sd(x, na.rm=TRUE), 2), nsmall = 2))
Mean <- sapply(df3, function(x) format(round(mean(x, na.rm=TRUE), 2), nsmall = 2))
#To get the Latex table for the rows
knitr::kable(data.frame(Variables, Label, Missing, Type, Min, Max, Mean, SD, row.names = NULL), "latex")
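However, this code still computes the mean and SD. I would like them to appear as "n.a", as in the table above. Any suggestions? Also, the minimum and maximum are being computed on character values; I only want them reported for numeric variables.
You can try this: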
df3 <- Filter(is.character, df3)
Variables <- names(df3)
Label <- label(df3)
Missing <- sapply(df3, function(x) sum(is.na(x)))
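# Labelled columns carry two classes (e.g. "labelled", "character"); take the second when present
Type <- sapply(df3, function(x) {tmp <- class(x); if(length(tmp) > 1) tmp[2] else tmp[1]})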
Min <- 'n.a'
Max <- 'n.a'
SD <- 'n.a'
Mean <- 'n.a'
#To get the Latex table for the rows
knitr::kable(data.frame(Variables, Label, Missing, Type, Min, Max, Mean, SD, row.names = NULL), "latex")
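If the data frame may also contain numeric columns, the same idea can be generalised so that only character columns show "n.a". The following is a minimal sketch, not a definitive implementation: `summarise_vars` and its `na_strings` argument are hypothetical names, the `age` column is made up for illustration, and the labels are kept in a plain named vector so that the example runs in base R without `label()`. It also assumes that the literal string "NULL" is meant to denote a missing value; `is.na()` alone would not count those entries.

```r
# Hypothetical helper (not from the original post): builds one summary row per column.
summarise_vars <- function(df, labels, na_strings = "NULL") {
  rows <- lapply(names(df), function(v) {
    x <- df[[v]]
    # Assumption: strings listed in na_strings stand for missing values
    if (is.character(x)) x[x %in% na_strings] <- NA
    num <- is.numeric(x)
    # Compute a statistic for numeric columns only; everything else gets "n.a"
    fmt <- function(f) {
      if (num && any(!is.na(x))) {
        format(round(f(x, na.rm = TRUE), 2), nsmall = 2)
      } else {
        "n.a"
      }
    }
    data.frame(
      Variables = v,
      Label     = unname(labels[v]),
      Missing   = sum(is.na(x)),
      Type      = class(x)[length(class(x))],  # last class, e.g. "character"
      Min       = fmt(min),
      Max       = fmt(max),
      Mean      = fmt(mean),
      SD        = fmt(sd),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

df <- data.frame(
  dxe1     = c("W07XXXA", "NULL", "3"),
  dxe1_poa = c("Y", "NULL", "N"),
  age      = c(34, 51, NA),  # hypothetical numeric column, for illustration only
  stringsAsFactors = FALSE
)
var_labels <- c(
  dxe1     = "External Cause of Injury Diagnosis 1",
  dxe1_poa = "External Cause of Injury Diagnosis 1 - Present on Admission",
  age      = "Age in years (hypothetical)"
)

out <- summarise_vars(df, var_labels)
print(out)
```

The resulting data frame can be passed to `knitr::kable(out, "latex")` in the same way as above. Building each row separately keeps the statistics columns as character, so formatted numbers and "n.a" can sit in the same column.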