Take the percentage after grouping by multiple variables in R (dplyr)
I have the following table in R:
id | var1 | var2 | value |
---|---|---|---|
ID1 | A | X | 1 |
ID2 | B | X | 2 |
ID3 | C | X | 3 |
ID4 | D | X | 4 |
ID5 | A | Y | 2 |
ID6 | C | Y | 5 |
ID7 | B | Y | 3 |
I then want to group_by var1 and var2 in dplyr and take the proportion of each group, so that the result is:
id | var1 | var2 | value |
---|---|---|---|
ID1 | A | X | 1/3 |
ID2 | A | Y | 2/3 |
ID3 | C | X | 3/8 |
ID4 | C | Y | 5/8 |
ID5 | B | X | 2/5 |
ID6 | B | Y | 3/5 |
ID7 | D | X | 1 |
I tried:

    id = c("ID1", "ID2", "ID3", "ID4", "ID5", "ID6", "ID7")
    var1 = c("A", "B", "C", "D", "A", "C", "B")
    var2 = c(rep("X", 4), rep("Y", 3))
    value = c(1, 2, 3, 4, 2, 5, 3)
    data = data.frame(id, var1, var2, value); data
    library(dplyr)
    data %>%
      group_by(var1, var2) %>%
      summarise(prop = sum(value))

But this only groups by var1 and var2 and sums each group; it does not give the proportions. Any help?
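This might work. Grouping by var1 only makes sum(value) the total of each var1 group, so mutate() divides every row by its own group's total:

    library(dplyr)
    data %>%
      group_by(var1) %>%
      mutate(value = value / sum(value)) %>%
      arrange(var1, var2)

Result:

      id    var1  var2  value
      <chr> <chr> <chr> <dbl>
    1 ID1   A     X     0.333
    2 ID5   A     Y     0.667
    3 ID2   B     X     0.4
    4 ID7   B     Y     0.6
    5 ID3   C     X     0.375
    6 ID6   C     Y     0.625
    7 ID4   D     X     1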
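If the real data can contain several rows for the same var1/var2 pair, a two-step variant may be closer to what the title asks for: first total each pair, then turn those totals into shares of the var1 total. The following is only a sketch under that assumption; the names props, total and prop are made up for illustration, and the `.groups` argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Same data as in the question.
data <- data.frame(
  id    = paste0("ID", 1:7),
  var1  = c("A", "B", "C", "D", "A", "C", "B"),
  var2  = c(rep("X", 4), rep("Y", 3)),
  value = c(1, 2, 3, 4, 2, 5, 3)
)

# Step 1: total per (var1, var2) pair. .groups = "drop_last" leaves the
#         summarised result grouped by var1 only.
# Step 2: divide each pair total by the total of its var1 group.
props <- data %>%
  group_by(var1, var2) %>%
  summarise(total = sum(value), .groups = "drop_last") %>%
  mutate(prop = total / sum(total)) %>%
  ungroup()

props
```

With the question's data every var1/var2 pair occurs once, so this should give the same proportions as the mutate() version, only without the id column.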
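data.table (here df is the question's data frame, which is called data above):

    library(data.table)
    # := adds the column res by reference; proportions(x) is x / sum(x) within each var1 group
    setDT(df)[, res := proportions(value), by = var1][order(var1)]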
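Base R:

    # ave() applies FUN within each var1 group and returns a vector aligned with the original rows
    df$res <- ave(df$value, list(df$var1), FUN = proportions)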
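A quick way to convince yourself that any of these approaches is right is to check that the proportions add up to 1 inside every var1 group. The sketch below does that with base R only. It assumes the question's data is stored as df, uses prop.table() (the older name for proportions()), and introduces group_sums purely as an illustrative name.

```r
# Same data as in the question, stored as df to match the snippets above.
df <- data.frame(
  id    = paste0("ID", 1:7),
  var1  = c("A", "B", "C", "D", "A", "C", "B"),
  var2  = c(rep("X", 4), rep("Y", 3)),
  value = c(1, 2, 3, 4, 2, 5, 3)
)

# Share of each value within its var1 group, in the original row order.
df$res <- ave(df$value, df$var1, FUN = prop.table)

# Sum the shares per var1 group; every group should total 1.
group_sums <- tapply(df$res, df$var1, sum)
stopifnot(isTRUE(all.equal(as.vector(group_sums), rep(1, 4))))

# Sort for display so rows of the same var1 group sit together.
df[order(df$var1, df$var2), ]
```

If stopifnot() raises no error, each group sums to 1 up to floating-point tolerance.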