Weighted mean using aggregate

Question

Sorry for asking a very basic question, but I am stuck on a problem and can't seem to get out of it.

My data looks like this:

Medicine  Biology  Business sex weights
0           1          0     1     0.5
0           0          1     0     1
1           0          0     1     05
0           1          0     0     0.33
0           0          1     0     0.33
1           0          0     1     1 
0           1          0     0     0.33
0           0          1     1     1
1           0          0     1     1

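The first three columns are fields of study and the fourth is sex. There are of course many more observations. What I want is the mean of each field of study (Medicine, Biology, Business) by sex, that is, one mean for men and one for women. For that I used the following code: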

barplot_sex <- aggregate(x = df_dummies[, 1:19], by = list(df$sex),
                         FUN = function(x) mean(x))

This works fine and does what I need. My problem is that I now need a weighted mean, but I can't use

FUN= function(x) weighted.mean(x, weights)

because there are many more observations than fields of study.

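The only workaround I found was to use edit(boxplot) and change the values by hand, but R does not save the changes. Besides, I am sure there must be a simple way to do exactly what I need.

Any help would be appreciated.

Best, Gabriel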

Answer 1

Use by.

by(dat, dat$sex, function(x) sapply(x[, 1:3], weighted.mean, x[, "weights"]))
# dat$sex: 0
# Medicine   Biology  Business 
# 0.0000000 0.3316583 0.6683417 
# --------------------------------------------------------------------------------------- 
# dat$sex: 1
# Medicine    Biology   Business 
# 0.82352941 0.05882353 0.11764706
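
A minimal sketch of why by succeeds where the aggregate attempt from the question fails, assuming the weights sit in a column of the same data frame. The data frame is rebuilt here only so the snippet runs on its own; err, grp1 and check1 are names made up for this illustration.

```r
# Same values as the answer's data, rebuilt so the sketch is self-contained.
dat <- data.frame(
  Medicine = c(0, 0, 1, 0, 0, 1, 0, 0, 1),
  Biology  = c(1, 0, 0, 1, 0, 0, 1, 0, 0),
  Business = c(0, 1, 0, 0, 1, 0, 0, 1, 0),
  sex      = c(1, 0, 1, 0, 0, 1, 0, 1, 1),
  weights  = c(0.5, 1, 5, 0.33, 0.33, 1, 0.33, 1, 1)
)

# aggregate() hands FUN one column of one group at a time, so the
# full-length weights vector no longer matches x and weighted.mean() stops.
err <- try(
  aggregate(dat[, 1:3], by = list(sex = dat$sex),
            FUN = function(x) weighted.mean(x, dat$weights)),
  silent = TRUE
)
inherits(err, "try-error")

# by() hands FUN the whole data-frame slice of a group, so the weights
# are subset together with the field columns. Hand check for sex == 1:
grp1   <- dat[dat$sex == 1, ]
check1 <- sum(grp1$Medicine * grp1$weights) / sum(grp1$weights)
check1
```

The hand-computed value for Medicine in the sex == 1 group should agree with the corresponding entry of the by output above.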

Data:

dat <- structure(list(Medicine = c(0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 1L
), Biology = c(1L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L), Business = c(0L, 
1L, 0L, 0L, 1L, 0L, 0L, 1L, 0L), sex = c(1L, 0L, 1L, 0L, 0L, 
1L, 0L, 1L, 1L), weights = c(0.5, 1, 5, 0.33, 0.33, 1, 0.33, 
1, 1)), class = "data.frame", row.names = c(NA, -9L))
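
If the result is meant to feed a bar plot, as the name barplot_sex in the question suggests, a plain matrix may be more convenient than the by object. The following is a sketch of one possible variant, not part of the original answer: it swaps by for split plus sapply, and the name wm is made up here.

```r
# Same values as the data above, rebuilt so the sketch is self-contained.
dat <- data.frame(
  Medicine = c(0, 0, 1, 0, 0, 1, 0, 0, 1),
  Biology  = c(1, 0, 0, 1, 0, 0, 1, 0, 0),
  Business = c(0, 1, 0, 0, 1, 0, 0, 1, 0),
  sex      = c(1, 0, 1, 0, 0, 1, 0, 1, 1),
  weights  = c(0.5, 1, 5, 0.33, 0.33, 1, 0.33, 1, 1)
)

# split() cuts the data frame into one piece per sex; the inner sapply()
# computes the weighted mean of each field column using that piece's weights.
# The outer sapply() binds the per-group vectors into a matrix with
# fields as rows and sex levels ("0", "1") as columns.
wm <- sapply(split(dat, dat$sex), function(g) {
  sapply(g[, 1:3], weighted.mean, w = g$weights)
})
wm
```

A matrix of this shape can be passed directly to barplot, for example barplot(wm, beside = TRUE, legend.text = rownames(wm)). With more field columns, only the column index 1:3 would need to change.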

aggregate

r

weighted-average