Number of value occurrences grouped by GroupID in R
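I have a dataset with multiple columns, each containing multiple values. What I want is the count of each value in each column, grouped by GroupId.
Example: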
GroupId | C1 | C2
1 | "valColOne1" | "valColTwo2"
2 | "valColOne1" | "valColTwo2"
2 | "valColOne1" | "valColTwo2"
2 | "valColOne2" | "valColTwo1"
1 | "valColOne1" | "valColTwo1"
The result should be:
GroupId | valColOne1 | valColOne2 | valColTwo1 | valColTwo2
1 | 2 | 0 | 1 | 1
2 | 2 | 1 | 1 | 2
Note that all values in the initial table will be strings.
Take your original data frame (which I'll call dat) and melt it to long format. Then use dcast to count the number of occurrences of each value.
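library(reshape2)
# Long format: one row per (GroupId, source column, value)
dat.m = melt(dat, id.vars = "GroupId")
# Wide format: one column per distinct value, counting occurrences per group
dcast(dat.m, GroupId ~ value, fun.aggregate = length)
  GroupId valColOne1 valColOne2 valColTwo1 valColTwo2
1       1          2          0          1          1
2       2          2          1          1          2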
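To make the intermediate long format concrete, here is a minimal base R sketch of the same two steps without reshape2. It builds the long table by hand and counts with xtabs, which is a swapped-in alternative to dcast rather than what the answer uses. The names long and counts are illustrative, and the sample data simply mirrors the question's example.

```r
# Sample data matching the question's example table
dat <- data.frame(
  GroupId = c(1, 2, 2, 2, 1),
  C1 = c("valColOne1", "valColOne1", "valColOne1", "valColOne2", "valColOne1"),
  C2 = c("valColTwo2", "valColTwo2", "valColTwo2", "valColTwo1", "valColTwo1"),
  stringsAsFactors = FALSE
)

# Long format built by hand: one row per (GroupId, source column, value).
# This is roughly the shape that melt(dat, id.vars = "GroupId") produces.
long <- data.frame(
  GroupId  = rep(dat$GroupId, times = 2),
  variable = rep(c("C1", "C2"), each = nrow(dat)),
  value    = c(dat$C1, dat$C2),
  stringsAsFactors = FALSE
)

# Cross-tabulate: rows are groups, columns are distinct values, cells are counts
counts <- xtabs(~ GroupId + value, data = long)
print(counts)
```

Because the value columns are stacked before counting, a string that appears in both C1 and C2 would be counted together under a single column.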
You can use table from base R:
table(data.frame(GroupId= df1$GroupId, Val=unlist(df1[-1])))
#        Val
# GroupId valColOne1 valColOne2 valColTwo1 valColTwo2
#       1          2          0          1          1
#       2          2          1          1          2
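The call above returns a table object rather than a data frame. If a data frame with an explicit GroupId column is needed, as in the expected result from the question, one possible sketch is shown below. The names tab and res are illustrative, and the explicit rep() only spells out the recycling of GroupId that data.frame() does implicitly in the one-liner.

```r
# Sample data matching the question's example table
df1 <- data.frame(
  GroupId = c(1, 2, 2, 2, 1),
  C1 = c("valColOne1", "valColOne1", "valColOne1", "valColOne2", "valColOne1"),
  C2 = c("valColTwo2", "valColTwo2", "valColTwo2", "valColTwo1", "valColTwo1"),
  stringsAsFactors = FALSE
)

# Repeat GroupId once per value column so it lines up with the stacked values
tab <- table(
  GroupId = rep(df1$GroupId, times = ncol(df1) - 1),
  Val     = unlist(df1[-1])
)

# Convert the 2-d table to a data frame, keeping one column per distinct value
res <- as.data.frame.matrix(tab)

# Move the group labels from the row names into a regular column
res <- cbind(GroupId = as.numeric(rownames(res)), res)
rownames(res) <- NULL
print(res)
```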
Data
df1 <- structure(list(GroupId = c(1, 2, 2, 2, 1), C1 = c("valColOne1",
"valColOne1", "valColOne1", "valColOne2", "valColOne1"),
C2 = c("valColTwo2",
"valColTwo2", "valColTwo2", "valColTwo1", "valColTwo1")),
.Names = c("GroupId",
"C1", "C2"), row.names = c(NA, -5L), class = "data.frame")