How to calculate the percentage for two variables at once using R
I have example data like this:
dt=structure(list(group = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L), ae = c("increase in lymphocytes", "increase in lymphocytes",
"increase in abs. lymphocytes", "increase in lymphocytes", "decrease in abs. neutrophils",
"decrease in neutrophils", "decrease in abs. Monocytes", "decrease in monocytes",
"increase in lymphocytes", "decrease in hemoglobin", "decrease in neutrophils",
"decrease in abs. monocytes", "increase in lymphocytes"), link = c("Connected",
"Connected", "Connected", "Connected", "Connected", "Connected",
"Not connected", "Not connected", "Connected", "Not connected",
"Connected", "Not connected", "Connected")), class = "data.frame", row.names = c(NA,
-13L))
I need to calculate percentages for the two columns ae and link.
Here is what I tried:
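library(data.table)
setDT(dt)  # structure() above builds a plain data.frame; convert it before using data.table syntax
dt <- dt[,
         .(n_gr1 = .SD[group == 1, .N],  # rows of this ae/link pair in group 1
           n_gr2 = .SD[group == 2, .N],  # rows of this ae/link pair in group 2
           size_gr1 = 19,
           size_gr2 = 19),
         by = c("ae", "link")
]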
and I got a result that is not what I need:
                             ae          link n_gr1 n_gr2 size_gr1 size_gr2
1:      increase in lymphocytes     Connected     3     2       19       19
2: increase in abs. lymphocytes     Connected     1     0       19       19
3: decrease in abs. neutrophils     Connected     1     0       19       19
4:      decrease in neutrophils     Connected     1     1       19       19
5:   decrease in abs. Monocytes Not connected     1     0       19       19
6:        decrease in monocytes Not connected     1     0       19       19
7:       decrease in hemoglobin Not connected     0     1       19       19
8:   decrease in abs. monocytes Not connected     0     1       19       19
I need the percentages relative to the number of people in each group (size_gr1 and size_gr2), with two decimal places, for example like this:
ae link n_gr1 n_gr2
1: increase in lymphocytes Connected 3(15,79%) 2(10,53%)
3/19*100=15,79%
2/19*100=10,53%
How can I get the desired result?
Thanks.
I'm not sure I've really understood what you want, but how about this:
dt[, perc_gp1:= round(n_gr1/size_gr1*100, 2)]
dt[, perc_gp2:= round(n_gr2/size_gr2*100, 2)]
Of course, this approach doesn't scale well to more groups, so let me know if you need that.
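For the "3(15,79%)" layout shown in the question, the numeric percentage can be pasted onto the count. The following is only a minimal sketch, not the answerer's method: the helper `fmt_pct`, the `sizes` vector and the `lab_*` column names are made up for illustration, the two example rows are copied from the question's output, and the group size of 19 is the asker's assumption. Looping over `sizes` is one possible way to avoid writing a separate line per group.

```r
library(data.table)

# Two rows of counts copied from the question's output, so the sketch is self-contained.
res <- data.table(
  ae    = c("increase in lymphocytes", "decrease in hemoglobin"),
  link  = c("Connected", "Not connected"),
  n_gr1 = c(3L, 0L),
  n_gr2 = c(2L, 1L)
)

# Assumed group sizes, taken from the question (19 people in each group).
sizes <- c(gr1 = 19, gr2 = 19)

# Hypothetical helper: format a count as "n(pp,pp%)" with a decimal comma.
fmt_pct <- function(n, size) {
  pct <- formatC(n / size * 100, format = "f", digits = 2, decimal.mark = ",")
  paste0(n, "(", pct, "%)")
}

# One label column per group; supporting another group only means extending `sizes`.
for (g in names(sizes)) {
  set(res, j = paste0("lab_", g),
      value = fmt_pct(res[[paste0("n_", g)]], sizes[[g]]))
}

print(res)
```

The label columns are character, so it is probably worth keeping the numeric counts alongside them if further calculations are needed.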
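I'm a little unclear about what exactly is being asked, but I believe that if you want each category's share of the total, you can do the following with the tidyverse and janitor packages: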
library(dplyr)

dt %>%
  count(group, ae) %>%              # count rows for each group/ae combination
  mutate(ae_per = n / sum(n)) %>%   # share of each ae category in the grand total (not the group subtotal)
  janitor::adorn_totals()           # adds a total row at the bottom
This code produces the following output:
group ae n ae_per
1 decrease in abs. Monocytes 1 0.07692308
1 decrease in abs. neutrophils 1 0.07692308
1 decrease in monocytes 1 0.07692308
1 decrease in neutrophils 1 0.07692308
1 increase in abs. lymphocytes 1 0.07692308
1 increase in lymphocytes 3 0.23076923
2 decrease in abs. monocytes 1 0.07692308
2 decrease in hemoglobin 1 0.07692308
2 decrease in neutrophils 1 0.07692308
2 increase in lymphocytes 2 0.15384615
Total - 13 1.00000000
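The output above divides by the grand total of 13 rows, whereas the question divides by a fixed group size of 19. A rough sketch of that variant in the same dplyr style follows. It assumes the group sizes are supplied in a small lookup table (the `sizes` data frame is hypothetical) and uses a reduced stand-in for the question's data. `tidyr::pivot_wider` is used here to put the two groups side by side.

```r
library(dplyr)
library(tidyr)

# Reduced stand-in for the question's data (same columns, fewer rows).
dt <- data.frame(
  group = c(1L, 1L, 1L, 2L, 2L, 2L),
  ae    = c("increase in lymphocytes", "increase in lymphocytes",
            "increase in lymphocytes", "increase in lymphocytes",
            "increase in lymphocytes", "decrease in hemoglobin"),
  link  = c("Connected", "Connected", "Connected",
            "Connected", "Connected", "Not connected")
)

# Assumption from the question: each group has 19 people.
sizes <- data.frame(group = c(1L, 2L), size = c(19, 19))

res <- dt %>%
  count(group, ae, link) %>%                    # rows per group/ae/link
  left_join(sizes, by = "group") %>%            # attach the assumed group size
  mutate(pct = round(n / size * 100, 2)) %>%    # percent of the group size
  select(-size) %>%
  pivot_wider(names_from = group,
              values_from = c(n, pct),
              values_fill = 0)                  # absent combinations count as 0

print(res)
```

With this layout the columns come out as `n_1`, `n_2`, `pct_1` and `pct_2`, one row per ae/link pair.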