Deviations from group mean by id
I have some data with time nested within individuals:
set.seed(124)
x = rnorm(25)
data.frame(id=rep(1:5, each=5), time=1:5, x=x)
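What is a base R solution for appending a column that holds each observation's deviation from the same person's mean over time (i.e., centering on that person's mean)? The output should look like this (x.c is the appended column containing the deviation from the person mean):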
   id time           x           x.c
1   1    1 -1.38507062  3.814056e-07
2   1    2  0.03832318  1.423394e+00
3   1    3 -0.76303016  6.220408e-01
4   1    4  0.21230614  1.597377e+00
5   1    5  1.42553797  2.810609e+00
6   2    1  0.74447982  2.233398e-08
7   2    2  0.70022940 -4.425040e-02
8   2    3 -0.22935461 -9.738344e-01
9   2    4  0.19709386 -5.473859e-01
10  2    5  1.20715377  4.626740e-01
11  3    1  0.31833673  2.642477e-08
12  3    2 -1.42379885 -1.742136e+00
13  3    3 -0.40509086 -7.234276e-01
14  3    4  0.99538657  6.770499e-01
15  3    5  0.95881779  6.404811e-01
16  4    1  0.91808790 -3.680049e-09
17  4    2 -0.15096960 -1.069058e+00
18  4    3 -1.22306879 -2.141157e+00
19  4    4 -0.86882429 -1.786912e+00
20  4    5 -1.04248536 -1.960573e+00
21  5    1 -1.10363778  2.169331e-07
22  5    2  0.44418506  1.547823e+00
23  5    3 -0.20495061  8.986874e-01
24  5    4  1.67563243  2.779270e+00
25  5    5 -0.13132225  9.723158e-01
I know the tidyverse solution is group_by, but I would like a base R solution. Thanks!
A base R solution is to get the mean by 'id' with ave and subtract it from the individual observations of 'x':
df1$x.c <- with(df1, x - ave(x, id))
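To make the one-liner above easier to try, here is a minimal self-contained sketch. It assumes the question's data frame has been assigned to the name df1 (the question itself does not assign it), and it adds a sanity check that is not part of the original answer: values centered on a group mean should sum to roughly zero within each id.

```r
# Rebuild the question's data; the name df1 is an assumption matching the answer.
set.seed(124)
df1 <- data.frame(id = rep(1:5, each = 5), time = 1:5, x = rnorm(25))

# ave() returns a vector as long as x that holds each row's group mean
# (its FUN argument defaults to mean), so the subtraction is element-wise.
df1$x.c <- with(df1, x - ave(x, id))

# Sanity check (illustrative): centered values sum to ~0 within each id.
group_sums <- tapply(df1$x.c, df1$id, sum)
stopifnot(all(abs(group_sums) < 1e-12))
```

Because ave() keeps the original row order, no merging or re-sorting is needed afterwards.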
Here is an alternative base R approach using aggregate:
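# df is the data frame from the question; merge the per-id means back onto it
df1 <- merge(df, aggregate(x ~ id, data = df, mean),
             by = "id", suffixes = c("", "mean"))
df1$x.c <- df1$x - df1$xmean
df1[-4]  # drop the helper column xmean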
   id time           x        x.c
1   1    1 -1.38507062 -1.2906839
2   1    2  0.03832318  0.1327099
3   1    3 -0.76303016 -0.6686435
4   1    4  0.21230614  0.3066928
5   1    5  1.42553797  1.5199247
6   2    1  0.74447982  0.2205594
7   2    2  0.70022940  0.1763090
8   2    3 -0.22935461 -0.7532751
9   2    4  0.19709386 -0.3268266
10  2    5  1.20715377  0.6832333
11  3    1  0.31833673  0.2296065
12  3    2 -1.42379885 -1.5125291
13  3    3 -0.40509086 -0.4938211
14  3    4  0.99538657  0.9066563
15  3    5  0.95881779  0.8700875
16  4    1  0.91808790  1.3915399
17  4    2 -0.15096960  0.3224824
18  4    3 -1.22306879 -0.7496168
19  4    4 -0.86882429 -0.3953723
20  4    5 -1.04248536 -0.5690333
21  5    1 -1.10363778 -1.2396192
22  5    2  0.44418506  0.3082037
23  5    3 -0.20495061 -0.3409320
24  5    4  1.67563243  1.5396511
25  5    5 -0.13132225 -0.2673036
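As a rough cross-check of the aggregate approach, the sketch below rebuilds the data under the assumed name df, spells out the intermediate steps, and compares the result against the ave-based centering. The names means, merged, and via_ave are illustrative only and do not appear in the original answers.

```r
# Rebuild the question's data; the name df is an assumption matching the answer.
set.seed(124)
df <- data.frame(id = rep(1:5, each = 5), time = 1:5, x = rnorm(25))

# One row per id holding that id's mean of x.
means <- aggregate(x ~ id, data = df, FUN = mean)

# Both inputs have a non-key column named "x"; suffixes = c("", "mean")
# keeps the original as "x" and names the aggregated one "xmean".
merged <- merge(df, means, by = "id", suffixes = c("", "mean"))
merged$x.c <- merged$x - merged$xmean

# merge() sorts on the key, so restore a known row order before comparing.
merged <- merged[order(merged$id, merged$time), ]
via_ave <- with(df, x - ave(x, id))
stopifnot(isTRUE(all.equal(merged$x.c, via_ave)))
```

The two approaches should agree here. The practical difference is that merge() may reorder rows when the data are not already sorted by id, whereas ave() leaves the order untouched.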