Sumproduct by condition in a data frame in R
Consider the following data frame:
df <- data.frame(row_id = c("r1","r2","r3","r4","r1","r2","r3","r4"),
v1 = c(3,2,5,2,5,2,6,4),
v2 = c(4,3,5,3,7,4,6,7))
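I want to compute the sum of products by "row_id". That is, for the rows with row_id "r1" I want to do the following calculation: (3*4)+(5*7), and likewise for the other ids.
So I would end up with the following data frame: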
df1 <- data.frame(row_id = c("r1","r2","r3","r4"),
v1 = c(47,14,61,34))
Any help would be appreciated.
Thanks.
library(dplyr)
df %>%
  # multiply all columns except row_id element-wise
  mutate(p = Reduce("*", .[-1])) %>%
  group_by(row_id) %>%
  # sum the per-row products within each row_id
  summarise(v = sum(p))
or
tapply(Reduce("*", df[-1]), df$row_id, sum)
#r1 r2 r3 r4
#47 14 61 34
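Because Reduce("*", df[-1]) multiplies every value column element-wise, the same pattern should carry over to more than two columns. A minimal sketch, assuming a hypothetical third column v3 with made-up values (neither is part of the original question):

```r
# Same data as in the question, plus a hypothetical extra column v3
df3 <- data.frame(row_id = c("r1","r2","r3","r4","r1","r2","r3","r4"),
                  v1 = c(3,2,5,2,5,2,6,4),
                  v2 = c(4,3,5,3,7,4,6,7),
                  v3 = c(1,2,1,2,1,2,1,2))  # made-up values for illustration

# Reduce("*", list(a, b, c)) computes (a * b) * c element-wise,
# giving one product per row
row_products <- Reduce("*", df3[-1])

# Sum the per-row products within each row_id
res3 <- tapply(row_products, df3$row_id, sum)
res3
# r1 r2 r3 r4
# 47 28 61 68
```

With only v1 and v2 present this reduces to the two-column result shown above.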
Similar, but slightly shorter:
dplyr::count(df, row_id, wt = v1*v2)
Using base R with split and %*%:
sapply(split(df[-1], df$row_id), function(x) x[,1] %*% x[,2])
# r1 r2 r3 r4
#47 14 61 34
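For two numeric vectors of equal length, %*% returns their inner product as a 1x1 matrix, which is why it serves as a sum of products here. A small sketch of that equivalence; the vapply variant is my own substitution for sapply (so the result type is checked), not something the answer uses:

```r
df <- data.frame(row_id = c("r1","r2","r3","r4","r1","r2","r3","r4"),
                 v1 = c(3,2,5,2,5,2,6,4),
                 v2 = c(4,3,5,3,7,4,6,7))

# The two r1 rows: (3*4) + (5*7) = 47
x <- c(3, 5)
y <- c(4, 7)
inner <- x %*% y          # 1x1 matrix
drop(inner) == sum(x * y) # TRUE: inner product equals the sum of products

# Same grouping as above, but each group must return exactly one number
res <- vapply(split(df[-1], df$row_id),
              function(g) sum(g$v1 * g$v2),
              numeric(1))
res
# r1 r2 r3 r4
# 47 14 61 34
```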
Another option is rowsum from base R:
rowsum(with(df, v1 * v2), group = df$row_id)
# [,1]
#r1 47
#r2 14
#r3 61
#r4 34
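rowsum returns a one-column matrix with the group labels as row names, whereas the question asks for a data frame with row_id and v1 columns. One possible way to convert it, as a sketch (the intermediate name m is mine):

```r
df <- data.frame(row_id = c("r1","r2","r3","r4","r1","r2","r3","r4"),
                 v1 = c(3,2,5,2,5,2,6,4),
                 v2 = c(4,3,5,3,7,4,6,7))

# One-column matrix of group sums; row names are the row_id values
m <- rowsum(df$v1 * df$v2, group = df$row_id)

# Move the row names into a column and drop them from the result
df1 <- data.frame(row_id = rownames(m),
                  v1 = unname(m[, 1]),
                  row.names = NULL)
df1
#   row_id v1
# 1     r1 47
# 2     r2 14
# 3     r3 61
# 4     r4 34
```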
Or using data.table:
library(data.table)
# note: setDT() converts df to a data.table in place (by reference)
setDT(df)[, do.call(`%*%`, .SD), row_id]
# row_id V1
#1: r1 47
#2: r2 14
#3: r3 61
#4: r4 34
Using base R, we can also transform and then aggregate:
aggregate(tot~row_id,transform(df,tot = v1*v2),sum)
#   row_id tot
# 1     r1  47
# 2     r2  14
# 3     r3  61
# 4     r4  34
Or you could also do:
c(by(df[-1],df[1],do.call,what = "%*%"))
#r1 r2 r3 r4
#47 14 61 34
Using dplyr:
library(dplyr)
df %>% group_by(row_id) %>% summarize(sum(v1*v2))
# which gives:
# A tibble: 4 x 2
#   row_id `sum(v1 * v2)`
#   <fct>           <dbl>
# 1 r1                 47
# 2 r2                 14
# 3 r3                 61
# 4 r4                 34