R data.table: reuse an aggregation
I want to apply the same aggregation to several data.tables without rewriting the aggregation scheme.
Consider:
library(data.table)
dt1 <- data.table(id = c(1, 2), a = rnorm(10), b = rnorm(10), c = rnorm(10))
dt2 <- data.table(id = c(1, 2), a = rnorm(10), b = rnorm(10), c = rnorm(10))
# the same j expression is written out once per table
dt1_aggregates <- dt1[, .(mean_a = mean(a), sd_a = sd(a), mean_b = mean(b), sd_b = sd(b)), by = id]
dt2_aggregates <- dt2[, .(mean_a = mean(a), sd_a = sd(a), mean_b = mean(b), sd_b = sd(b)), by = id]
Is there a way to reuse the aggregation scheme from dt1_aggregates for dt2 without writing it out twice?
You can quote the expression you want and then evaluate it inside the data.table:
# store the unevaluated j expression once
my.call <- quote(list(mean_a = mean(a), sd_a = sd(a), mean_b = mean(b), sd_b = sd(b)))
dt1[, eval(my.call), by = id]
produces
   id       mean_a      sd_a      mean_b      sd_b
1:  1  0.004165423 0.7504691 -0.05001424 1.4440434
2:  2 -0.430910188 0.9648096  0.26918995 0.8680997
and
dt2[, eval(my.call), by=id]
produces
   id     mean_a     sd_a     mean_b      sd_b
1:  1  0.2974145 1.191863 -0.0588854 0.7896988
2:  2 -0.4642856 1.438937  0.3612607 1.0581702
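If there are more than two tables, the same quoted expression can be applied to all of them in one pass. The following is only a sketch of that idea, not part of the original answer: the names agg_expr, tables and results are made up for illustration, the seed is arbitrary, and id is built with rep() so the example does not rely on vector recycling.

```r
library(data.table)

set.seed(42)  # arbitrary seed, only so the sketch is reproducible
dt1 <- data.table(id = rep(1:2, 5), a = rnorm(10), b = rnorm(10), c = rnorm(10))
dt2 <- data.table(id = rep(1:2, 5), a = rnorm(10), b = rnorm(10), c = rnorm(10))

# the aggregation scheme, written once as an unevaluated expression
agg_expr <- quote(list(mean_a = mean(a), sd_a = sd(a),
                       mean_b = mean(b), sd_b = sd(b)))

# hypothetical named list of the tables to aggregate
tables <- list(dt1 = dt1, dt2 = dt2)

# evaluate the stored expression in j for every table, grouped by id
results <- lapply(tables, function(dt) dt[, eval(agg_expr), by = id])

results$dt1
results$dt2
```

Each element of results is a data.table with one row per id and the columns mean_a, sd_a, mean_b and sd_b. Changing the aggregation then only requires editing agg_expr in one place.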