How do I make a summary and then multiply the result by group?

Question

This is my data frame. Country1 represents people living in Germany, and Country2 represents the country where they lived 5 years before moving to Country1.

Country1	Country2	Weight	obs
Germany	Germany	4	1
Germany	Germany	119	2
France	Germany	3	3
France	Germany	2	4
Italy	France	1	5

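What I want is to sum the Weight column for each combination and then multiply that sum by each row's observation value (the obs column). For example, the first row is the Germany-to-Germany combination, so I sum the weights for that combination (119 + 4 = 123) and multiply the total by that row's obs value (123 * 1 = 123). For the second row the Germany total is again 119 + 4 = 123, and it must be multiplied by that row's obs value (123 * 2 = 246). For the third row the weights sum to 3 + 2 = 5, which is then multiplied by that row's obs value (5 * 3 = 15), and so on.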

The output I want is the x column, and it should look like this.

Country1	Country2	Weight	obs	x
Germany	Germany	4	1	123
Germany	Germany	119	2	246
France	Germany	3	3	15
France	Germany	2	4	20
Italy	France	1	5	5

This is also the formula I tried to apply.

Answer 1

Try this:

library(dplyr)
#Code
new <- df %>% group_by(Country1) %>%
  mutate(x=sum(Weight)*obs)

Output:

# A tibble: 5 x 5
# Groups:   Country1 [3]
  Country1 Country2 Weight   obs     x
  <chr>    <chr>     <int> <int> <int>
1 Germany  Germany       4     1   123
2 Germany  Germany     119     2   246
3 France   Germany       3     3    15
4 France   Germany       2     4    20
5 Italy    France        1     5     5

Data used:

#Data
df <- structure(list(Country1 = c("Germany", "Germany", "France", "France", 
"Italy"), Country2 = c("Germany", "Germany", "Germany", "Germany", 
"France"), Weight = c(4L, 119L, 3L, 2L, 1L), obs = 1:5), class = "data.frame", row.names = c(NA, 
-5L))
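
The question describes the sum as being per combination of Country1 and Country2. With the sample data, grouping by Country1 alone gives the same totals, but if the pair is what defines a group, a minimal sketch (assuming that reading of the question) groups on both columns and drops the grouping afterwards:

```r
library(dplyr)

# Same sample data as above, rebuilt so the snippet runs on its own
df <- data.frame(
  Country1 = c("Germany", "Germany", "France", "France", "Italy"),
  Country2 = c("Germany", "Germany", "Germany", "Germany", "France"),
  Weight   = c(4L, 119L, 3L, 2L, 1L),
  obs      = 1:5
)

new <- df %>%
  group_by(Country1, Country2) %>%   # one group per Country1/Country2 pair
  mutate(x = sum(Weight) * obs) %>%  # group total times the row's own obs
  ungroup()                          # remove grouping so later steps are not grouped
```

Calling ungroup() at the end is optional here; it only matters if further operations on new should not be performed per group.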

Answer 2

We can use a data.table approach:

library(data.table)
setDT(df1)[, x := sum(Weight) *obs, by = Country1][]

Output:

#   Country1 Country2 Weight obs   x
#1:  Germany  Germany      4   1 123
#2:  Germany  Germany    119   2 246
#3:   France  Germany      3   3  15
#4:   France  Germany      2   4  20
#5:    Italy   France      1   5   5

Or use base R with ave:

df1$x <- with(df1, ave(Weight, Country1, FUN = sum) * obs)

Data

df1 <- structure(list(Country1 = c("Germany", "Germany", "France", "France", 
"Italy"), Country2 = c("Germany", "Germany", "Germany", "Germany", 
"France"), Weight = c(4L, 119L, 3L, 2L, 1L), obs = 1:5),
class = "data.frame", row.names = c(NA, 
-5L))
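
To see how the ave one-liner maps onto the arithmetic in the question, it can be split into two steps. This is only an illustrative sketch: the group_total column is a name introduced here and is not part of the original answer.

```r
# Same sample data as above, rebuilt so the snippet runs on its own
df1 <- data.frame(
  Country1 = c("Germany", "Germany", "France", "France", "Italy"),
  Country2 = c("Germany", "Germany", "Germany", "Germany", "France"),
  Weight   = c(4L, 119L, 3L, 2L, 1L),
  obs      = 1:5
)

# Step 1: ave() returns a vector as long as Weight, where every row
# holds the sum of Weight for that row's Country1 group
df1$group_total <- ave(df1$Weight, df1$Country1, FUN = sum)

# Step 2: multiply each row's group total by that row's obs value
df1$x <- df1$group_total * df1$obs
```

Keeping the intermediate column makes it easy to check the totals (123 for Germany, 5 for France, 1 for Italy) before the multiplication.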

Answer 3

You can also solve it this way:

df1$x <- tapply(df1$Weight, df1$Country1, sum)[df1$Country1] * df1$obs

  Country1 Country2 Weight obs   x
1  Germany  Germany      4   1 123
2  Germany  Germany    119   2 246
3   France  Germany      3   3  15
4   France  Germany      2   4  20
5    Italy   France      1   5   5
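
The tapply line works as a lookup by name: tapply builds one total per country, and indexing that result with Country1 repeats the matching total on every row. A hedged sketch that spells this out follows; totals and per_row are illustrative names, and the as.character() call is an addition here to keep the lookup name-based even if Country1 is stored as a factor.

```r
# Same sample data as above, rebuilt so the snippet runs on its own
df1 <- data.frame(
  Country1 = c("Germany", "Germany", "France", "France", "Italy"),
  Country2 = c("Germany", "Germany", "Germany", "Germany", "France"),
  Weight   = c(4L, 119L, 3L, 2L, 1L),
  obs      = 1:5
)

# One total per country, named by country (France, Germany, Italy)
totals <- tapply(df1$Weight, df1$Country1, sum)

# Look up each row's total by its country name
per_row <- totals[as.character(df1$Country1)]

# as.vector() drops the names so x is stored as a plain column
df1$x <- as.vector(per_row) * df1$obs
```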
