Wide format with dcast and data.table

I want to convert a table like this (*):

set.seed(1)
mydata <- data.frame(ID=rep(1:4, each=3), R=rep(1:3, times=4), FIXED=rep(runif(4), each=3), AAA=rnorm(12), BBB=rbinom(12,12,0.5), CCC=runif(12))

ID R    FIXED    AAA  BBB   CCC
 1 1    0.26   -0.83   8   0.82
 1 2    0.26    1.59   5   0.64
 1 3    0.26    0.32   6   0.78
 2 1    0.37   -0.82   6   0.55
 2 2    0.37    0.48   6   0.52
 2 3    0.37    0.73   4   0.78
 3 1    0.57    0.57   8   0.02
 3 2    0.57   -0.30   7   0.47
 3 3    0.57    1.51   7   0.73
 4 1    0.90    0.38   4   0.69
 4 2    0.90   -0.62   7   0.47
 4 3    0.90   -2.21   6   0.86    

into wide format, like this:

ID  FIXED1  AAA1    BBB1    CCC1    FIXED2  AAA2    BBB2    CCC2    FIXED3  AAA3    BBB3    CCC3
1   0.27    0.49       7    0.73     0.37   0.74       4    0.69      0.57  0.58       7    0.48
2   0.91    -0.31      6    0.86     0.20   1.51       8    0.44      0.90  0.39       7    0.24
3   0.94    -0.62      7    0.07     0.66  -2.21       6    0.10      0.63  1.12       6    0.32
4   0.06    -0.04      7    0.52     0.21  -0.02       3    0.66      0.18  0.94       6    0.41

How can I do this?
I tried

dcast(mydata, ID + FIXED ~ R, value.var=names(mydata)[3:5])

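and I even tried writing out the column names, "AAA", "BBB", "CCC", but it produces an error and I can't get the wide format I need. I have also tried other options, with no luck.

(*) In reality there are many more columns, but the story is the same.

The error is: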

Error in .subset2(x, i, exact = exact) : 
  recursive indexing failed at level 2
In addition: Warning message:
In if (!(value.var %in% names(data))) { :
  the condition has length > 1 and only the first element will be used

set.seed(1)
library(data.table)
mydata <- data.table(ID=rep(1:4, each=3), R=rep(1:3, times=4), FIXED=rep(runif(4), each=3), AAA=rnorm(12), BBB=rbinom(12,12,0.5), CCC=runif(12))
dcast(mydata, ID ~ R, value.var=names(mydata)[3:6])
   ID    FIXED_1    FIXED_2    FIXED_3      AAA_1      AAA_2       AAA_3 BBB_1 BBB_2 BBB_3     CCC_1     CCC_2     CCC_3
1:  1 0.43809711 0.43809711 0.43809711 -0.4781501  0.4179416  1.35867955     6     7     6 0.6422883 0.8762692 0.7789147
2:  2 0.24479728 0.24479728 0.24479728 -0.1027877  0.3876716 -0.05380504     5     7     5 0.7973088 0.4552745 0.4100841
3:  3 0.07067905 0.07067905 0.07067905 -1.3770596 -0.4149946 -0.39428995     7     4     5 0.8108702 0.6049333 0.6547239
4:  4 0.09946616 0.09946616 0.09946616 -0.0593134  1.1000254  0.76317575     4     5     3 0.3531973 0.2702601 0.9926841
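
The result above groups the new columns by variable (FIXED_1, FIXED_2, ...), while the layout requested in the question groups them by R (FIXED1, AAA1, BBB1, CCC1, FIXED2, ...). A minimal sketch of one way to reorder them, assuming the default "_" separator; the names value_cols, r_levels, and grouped are illustrative only:

```r
library(data.table)

set.seed(1)
mydata <- data.table(ID = rep(1:4, each = 3), R = rep(1:3, times = 4),
                     FIXED = rep(runif(4), each = 3), AAA = rnorm(12),
                     BBB = rbinom(12, 12, 0.5), CCC = runif(12))

# Columns to spread across the levels of R
value_cols <- c("FIXED", "AAA", "BBB", "CCC")
wide <- dcast(mydata, ID ~ R, value.var = value_cols)

# outer() builds a matrix with one row per variable and one column per R level;
# as.vector() reads it column by column, so the names come out grouped by R.
r_levels <- sort(unique(mydata$R))
grouped <- as.vector(outer(value_cols, r_levels, paste, sep = "_"))

# Reorder the columns by reference: ID first, then the grouped names
setcolorder(wide, c("ID", grouped))
names(wide)
```

setcolorder() changes the column order in place, so no reassignment is needed.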

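You referenced the wrong value variables (the AAA, BBB, and CCC columns are at index positions 4 to 6), and you should convert the data frame to a data.table with setDT(). Using: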

dcast(setDT(mydata), ID + FIXED ~ R, value.var = names(mydata)[4:6])

gives:

   ID     FIXED      AAA_1      AAA_2      AAA_3 BBB_1 BBB_2 BBB_3     CCC_1     CCC_2     CCC_3
1:  1 0.2655087 -0.8356286  1.5952808  0.3295078     8     5     6 0.8209463 0.6470602 0.7829328
2:  2 0.3721239 -0.8204684  0.4874291  0.7383247     6     6     4 0.5530363 0.5297196 0.7893562
3:  3 0.5728534  0.5757814 -0.3053884  1.5117812     8     7     7 0.0233312 0.4772301 0.7323137
4:  4 0.9082078  0.3898432 -0.6212406 -2.2146999     4     7     6 0.6927316 0.4776196 0.8612095
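
Since the original mistake was an off-by-one in the column positions, and the real data reportedly has many more columns, it may be safer to pick the value columns by name. A small sketch, assuming every column other than ID, FIXED, and R should be spread; id_cols, spread_col, and value_cols are hypothetical helper names:

```r
library(data.table)

set.seed(1)
mydata <- data.table(ID = rep(1:4, each = 3), R = rep(1:3, times = 4),
                     FIXED = rep(runif(4), each = 3), AAA = rnorm(12),
                     BBB = rbinom(12, 12, 0.5), CCC = runif(12))

id_cols    <- c("ID", "FIXED")  # kept on the left-hand side of the formula
spread_col <- "R"               # its levels become the column suffixes

# Everything that is neither an id column nor the spread column is a value column
value_cols <- setdiff(names(mydata), c(id_cols, spread_col))

wide <- dcast(mydata, ID + FIXED ~ R, value.var = value_cols)
```

This keeps the call correct if columns are added or reordered later.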

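If you don't convert to a data.table, the data.table package falls back on the reshape2 implementation of dcast, which can't handle multiple value.var columns, hence the error message.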
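
To make the conversion step concrete, here is a short sketch of the two usual ways to get a data.table from a data frame. It assumes that keeping the original data frame untouched may matter in some workflows; dt_copy is an illustrative name:

```r
library(data.table)

set.seed(1)
mydata <- data.frame(ID = rep(1:4, each = 3), R = rep(1:3, times = 4),
                     FIXED = rep(runif(4), each = 3), AAA = rnorm(12),
                     BBB = rbinom(12, 12, 0.5), CCC = runif(12))

# as.data.table() returns a new object; mydata itself stays a plain data.frame
dt_copy <- as.data.table(mydata)
stopifnot(is.data.table(dt_copy), !is.data.table(mydata))

# setDT() converts mydata in place (by reference), without copying
setDT(mydata)
stopifnot(is.data.table(mydata))

# Either object can now be passed to dcast with several value.var columns
wide <- dcast(mydata, ID + FIXED ~ R, value.var = c("AAA", "BBB", "CCC"))
```

setDT() avoids a copy, which can matter for large tables; as.data.table() is the safer choice when other code still expects the original data frame.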

If you want a different separator, you can add, for example, the sep = '.' argument to dcast.
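
As a sketch of the sep argument mentioned above, using the same example data (wide_dot is an illustrative name):

```r
library(data.table)

set.seed(1)
mydata <- data.table(ID = rep(1:4, each = 3), R = rep(1:3, times = 4),
                     FIXED = rep(runif(4), each = 3), AAA = rnorm(12),
                     BBB = rbinom(12, 12, 0.5), CCC = runif(12))

# sep controls the string placed between the variable name and the R level
wide_dot <- dcast(mydata, ID + FIXED ~ R,
                  value.var = c("AAA", "BBB", "CCC"), sep = ".")

names(wide_dot)
# ID, FIXED, AAA.1, AAA.2, AAA.3, BBB.1, ... instead of AAA_1, AAA_2, ...
```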