Calculate mean difference values
I have the following data.table:
library(data.table)

set.seed(1234332)
kks <- data.table(name = c("a", "a", "b", "b", "b", "d", "d", "d", "e", "f", "f", "f", "f"),
                  year = c(2012, 2013, 2011, 2012, 2013, 2011, 2012, 2013, 2014, 2011, 2012, 2013, 2014),
                  loc = c(1, 1, 1, 1, 1, 2, 2, 2, 3, 2, 2, 2, 2),
                  value1 = runif(13),
                  value2 = rnorm(13),
                  value3 = runif(13))
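I want to calculate the mean difference values for the variables value1, value2 and value3, that is, each value minus a mean. However, the mean should be computed by the column loc rather than by name, so every row is compared with the mean of the rows that share its loc value.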
I could probably do this with a for loop, but I would like to know whether there is a simpler way using data.table.
Thanks in advance.
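As noted in the comments, I am not sure what you mean by "according to the loc column". If you mean loc as the by group, then I would do the following (I am also not sure whether you want to keep the old columns):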
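library(data.table)
col_nms <- paste0("value", 1:3)          # columns to centre
new_col_nms <- paste0("diff_", col_nms)  # names of the new columns
# Within each loc group, subtract the group mean from every value;
# := adds the new columns to kks by reference, the trailing [] prints the result
kks[,
    (new_col_nms) := lapply(.SD, function(x) x - mean(x)),
    by = loc,
    .SDcols = col_nms][]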
#> name year loc value1 value2 value3 diff_value1 diff_value2
#> 1: a 2012 1 0.5525851 -1.216676578 0.40580256 -0.18364096 -1.49987982
#> 2: a 2013 1 0.4099640 0.574428779 0.36695270 -0.32626208 0.29122553
#> 3: b 2011 1 0.9936533 2.104311096 0.81294598 0.25742724 1.82110785
#> 4: b 2012 1 0.8424899 -0.036993187 0.56436515 0.10626381 -0.32019643
#> 5: b 2013 1 0.8824381 -0.009053885 0.62298286 0.14621198 -0.29225713
#> 6: d 2011 2 0.1191904 -1.072735734 0.63039403 -0.39648257 -0.67081978
#> 7: d 2012 2 0.7608226 0.229140684 0.34979251 0.24514966 0.63105663
#> 8: d 2013 2 0.6963069 -0.570565439 0.59228905 0.18063397 -0.16864949
#> 9: e 2014 3 0.8863696 0.879948680 0.44716966 0.00000000 0.00000000
#> 10: f 2011 2 0.5611566 -0.684572180 0.05950426 0.04548364 -0.28265623
#> 11: f 2012 2 0.9437519 -0.499744731 0.10843697 0.42807896 -0.09782878
#> 12: f 2013 2 0.4148588 0.541616469 0.72578654 -0.10081415 0.94353242
#> 13: f 2014 2 0.1136235 -0.756550716 0.94052853 -0.40204951 -0.35463477
#> diff_value3
#> 1: -0.148807293
#> 2: -0.187657152
#> 3: 0.258336131
#> 4: 0.009755303
#> 5: 0.068373010
#> 6: 0.143718043
#> 7: -0.136883476
#> 8: 0.105613065
#> 9: 0.000000000
#> 10: -0.427171722
#> 11: -0.378239014
#> 12: 0.239110555
#> 13: 0.453852549
Created on 2022-04-07 by the reprex package (v2.0.1)
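If the old columns should not be kept, one possible variant is to drop := and return a new table that holds only loc and the differences. The following is a minimal sketch under that assumption. The table dt and its values are made up for illustration and are not the data from the question.

```r
library(data.table)

# Small hypothetical table (not the question's data) with two loc groups
dt <- data.table(
  loc    = c(1, 1, 2, 2, 2),
  value1 = c(1, 3, 10, 20, 30),
  value2 = c(2, 4, 0, 5, 10)
)
col_nms <- c("value1", "value2")

# Without :=, the result is a new table holding only loc and the
# deviations from the group mean; dt itself is left unchanged
diffs <- dt[,
            lapply(.SD, function(x) x - mean(x)),
            by = loc,
            .SDcols = col_nms]
setnames(diffs, col_nms, paste0("diff_", col_nms))

# Sanity check: deviations from a group mean sum to zero within each group
check <- diffs[, lapply(.SD, sum), by = loc]
```

Here by = loc decides which rows share a mean, so replacing it with by = name would centre the values per name instead. The zero sums in check are a quick way to confirm that the grouping did what was intended.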