How can I reshape my data frame into a new form more efficiently (R)?

I have a dataset like this (df1):

ID  2   4   6   8   10  12  14  16  18  20  22  24   Day
1   0   0   0   0   2   0   0   0   1   0   1   0    Sunday
1   0   0   0   0   0   4   0   0   0   0   0   0   Monday
1   0   0   0   0   0   0   0   0   2   0   0   0   Tuesday
1   0   0   0   0   0   0   2   0   0   0   0   0   Wednesday
1   0   0   0   0   0   0   0   2   0   0   0   0   Thursday
1   0   0   0   0   0   0   0   0   2   0   0   0   Friday
1   0   0   0   0   0   0   0   0   0   2   0   0   Saturday
2   0   0   0   0   0   0   0   0   0   0   0   0   Sunday
2   0   0   0   0   0   1   0   0   0   0   0   0   Monday
2   0   0   0   0   0   0   1   0   0   0   1   0   Tuesday
2   0   0   0   0   0   0   0   1   0   0   0   0   Wednesday
2   0   0   0   0   0   0   0   0   1   0   0   0   Thursday
2   0   0   0   0   0   2   0   0   0   1   0   0   Friday
2   0   0   0   0   0   0   0   0   0   0   0   0   Saturday
3   0   0   0   0   0   0   0   0   0   0   0   0   Sunday
3   0   0   0   0   0   0   2   0   0   0   0   0   Monday
3   0   0   0   0   0   1   0   0   2   0   0   0   Tuesday
3   0   0   0   0   0   0   0   0   0   0   0   0   Wednesday
3   0   0   0   0   0   0   0   2   0   0   0   0   Thursday
3   0   0   0   0   0   0   0   0   0   0   0   0   Friday
3   0   0   0   0   0   0   2   0   0   0   0   0   Saturday
3   0   0   0   0   0   0   0   2   0   0   0   0   Sunday

I also have a list of IDs like this (ID_checklist in the code below):

ID
1
2
3

I want to convert df1 into output like this:

ID  Var1    Var2    Var3    Var4    Var5 ...... Var82   Var83 Var84
1   0         0      0         0     2             2      0     0
2
3

Here Var1 stands for 'Sunday 2' (from the first data frame) and Var84 stands for 'Saturday 24'. I want to export the result to a .csv file.

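I am using a for loop (shown below) to do this because there are so many IDs. The problem is that this code runs very slowly. Is there a faster way to get the same result?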

library(dplyr)
library(reshape2)
for (i in ID_checklist$ID) {

  x = filter(df1, ID %in% i)
  x$Day = NULL
  df.melted = melt(t(x[,-1]), id.vars = NULL)
  myNewDF = data.frame(i, t(df.melted[,3]))
  write.table(myNewDF,file="my12x7.csv", append=TRUE,sep=",",col.names=FALSE,row.names=FALSE)
}

I think this is what you're looking for:

library(reshape2)

# this may be unnecessary depending on your data
# it will make sure the weekday columns come in the same order
# as the weekdays appear in your original data
df1$Day = factor(df1$Day, levels = unique(df1$Day))

# convert to a fully long format
df_long = melt(df1, id.vars = c("ID", "Day"))

# convert to the wide format you want
result = dcast(data = df_long, ID ~ Day + variable, value.var = "value", fun.aggregate = sum)

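This joins the day name to each existing variable name (for example, Sunday_X2). If you would rather have them named Var1, Var2, Var3, and so on, use paste() and rename the columns.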
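The renaming and export step might look something like the sketch below. It uses base R only. The small `result` data frame is a made-up stand-in for the dcast() output, and `var_key` and `out_file` are hypothetical names introduced for illustration; substitute your own file path.

```r
# Hypothetical stand-in for the `result` produced by dcast() (toy values)
result <- data.frame(
  ID        = c(1, 2),
  Sunday_X2 = c(0, 0),
  Sunday_X4 = c(2, 0),
  Monday_X2 = c(1, 0),
  Monday_X4 = c(0, 3)
)

# Keep a lookup of old and new names so each VarN can be traced
# back to its day/time combination later
var_key <- data.frame(
  new = paste("Var", seq_len(ncol(result) - 1), sep = ""),
  old = names(result)[-1],
  stringsAsFactors = FALSE
)

# Rename every column except ID to Var1, Var2, ...
names(result)[-1] <- var_key$new

# Write the whole table once, instead of appending one row per ID in a loop
out_file <- file.path(tempdir(), "my12x7.csv")  # hypothetical path
write.csv(result, out_file, row.names = FALSE)
```

Writing the file once at the end avoids reopening it for every ID, which is likely a good part of why the loop version is slow.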

We can look at the first few columns to verify:

result[, 1:6]
#   ID Sunday_X2 Sunday_X4 Sunday_X6 Sunday_X8 Sunday_X10
# 1  1         0         0         0         0          2
# 2  2         0         0         0         0          0
# 3  3         0         0         0         0          0
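As for why the factor() step matters: by default factor levels are sorted alphabetically, so without it the day columns would likely come out in alphabetical order rather than in the order the days appear in the data. A small base R illustration, where the `days` vector is made up for the example:

```r
# Made-up vector of day names, in the order they appear in the data
days <- c("Sunday", "Monday", "Tuesday", "Sunday", "Monday")

# Default: levels are sorted alphabetically
default_levels <- levels(factor(days))
print(default_levels)

# Explicit levels: order of first appearance, as in the answer
ordered_levels <- levels(factor(days, levels = unique(days)))
print(ordered_levels)
```

The first call gives Monday, Sunday, Tuesday; the second gives Sunday, Monday, Tuesday.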