Reshape long to wide efficiently with an amount variable and several ID variables

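My data is a much larger version of the first table below. I want to "unmelt" it into the second table, but I have not found an efficient way to do it. My most recent attempt is at the bottom, where IDVars is essentially the first three columns. It ran for 15 minutes before I had to kill it.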

Name    ID  Trial  Variable        Amount
Name 1  1   1      FinalSalary     300.00
Name 1  1   1      FinalDCBalance  400.00
Name 1  1   2      FinalSalary     300.00
Name 1  1   2      FinalDCBalance  300.00
Name 2  2   1      FinalSalary     400.00
Name 2  2   1      FinalDCBalance  400.00
Name 2  2   2      FinalSalary     200.00
Name 2  2   2      FinalDCBalance  300.00
Name 3  3   1      FinalSalary     100.00
Name 3  3   2      FinalDCBalance  400.00

Name    ID  Trial  FinalSalary  FinalDCBalance
Name 1  1   1      300          400
Name 1  1   2      300          300
Name 2  2   1      400          400
Name 2  2   2      200          300
Name 3  3   1      100          400
Name 3  3   2      300          100

unmelt <- reshape(dataframe, idvar = IDVars, v.names = 'variable', direction = 'wide', timevar = 'Amount')

We can use pivot_wider from tidyr:

library(tidyr)
pivot_wider(df1, names_from = 'Variable', values_from = 'Amount')
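
For illustration, here is a self-contained sketch of the same call on a four-row subset of the question's data, assuming tidyr is installed. The names `long` and `wide` are placeholders, not from the original post. `id_cols` is spelled out here only to make the identifying columns explicit; by default pivot_wider uses every column not named in `names_from` or `values_from`.

```r
library(tidyr)

# Hypothetical four-row subset of the long-format data from the question
long <- data.frame(
  Name     = c("Name 1", "Name 1", "Name 3", "Name 3"),
  ID       = c(1L, 1L, 3L, 3L),
  Trial    = c(1L, 1L, 1L, 2L),
  Variable = c("FinalSalary", "FinalDCBalance", "FinalSalary", "FinalDCBalance"),
  Amount   = c(300, 400, 100, 400)
)

# One output row per Name/ID/Trial combination, one column per Variable value
wide <- pivot_wider(
  long,
  id_cols     = c(Name, ID, Trial),
  names_from  = Variable,
  values_from = Amount
)

print(wide)
```

Combinations that are missing from the long data, such as FinalDCBalance for Name 3 in Trial 1, come out as NA in the wide result.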

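timevar= should be "Variable", not "Amount". The idvar columns run down the side, the timevar column is spread across the top, and everything else (here, Amount) fills the body of the output as values. v.names = "Amount" could be specified, but reshape works it out on its own because Amount is the only column left, so we omit it.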

r <- reshape(dd, dir = "wide", idvar = c("Name", "ID", "Trial"), timevar = "Variable")
names(r) <- sub("^Amount\\.", "", names(r)) # optional: drop the "Amount." prefix

giving:

> r
     Name ID Trial FinalSalary FinalDCBalance
1  Name 1  1     1         300            400
3  Name 1  1     2         300            300
5  Name 2  2     1         400            400
7  Name 2  2     2         200            300
9  Name 3  3     1         100             NA
10 Name 3  3     2          NA            400
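
The roles of idvar, timevar and v.names described above can also be written out in full. The following is a minimal base R sketch on a four-row subset of the data; the names `long`, `wide` and `id_cols` are illustrative only. The duplicate check is an added precaution rather than part of the original answer: reshape keeps only the first row when an id/timevar combination is repeated.

```r
# Hypothetical four-row subset of the long-format data from the question
long <- data.frame(
  Name     = c("Name 1", "Name 1", "Name 3", "Name 3"),
  ID       = c(1L, 1L, 3L, 3L),
  Trial    = c(1L, 1L, 1L, 2L),
  Variable = c("FinalSalary", "FinalDCBalance", "FinalSalary", "FinalDCBalance"),
  Amount   = c(300, 400, 100, 400),
  stringsAsFactors = FALSE
)

id_cols <- c("Name", "ID", "Trial")

# Each id/Variable combination should appear at most once in the long data
stopifnot(anyDuplicated(long[c(id_cols, "Variable")]) == 0)

# idvar: columns kept down the side; timevar: column spread across the top;
# v.names: column whose values fill the body
wide <- reshape(long, direction = "wide", idvar = id_cols,
                timevar = "Variable", v.names = "Amount")

# Remove the literal "Amount." prefix that reshape adds to the new columns
names(wide) <- sub("Amount.", "", names(wide), fixed = TRUE)

print(wide)
```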

Note

The input in reproducible form:

dd <- structure(list(Name = c("Name 1", "Name 1", "Name 1", "Name 1", 
"Name 2", "Name 2", "Name 2", "Name 2", "Name 3", "Name 3"), 
    ID = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L), Trial = c(1L, 
    1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L), Variable = c("FinalSalary", 
    "FinalDCBalance", "FinalSalary", "FinalDCBalance", "FinalSalary", 
    "FinalDCBalance", "FinalSalary", "FinalDCBalance", "FinalSalary", 
    "FinalDCBalance"), Amount = c(300, 400, 300, 300, 400, 400, 
    200, 300, 100, 400)), class = "data.frame", row.names = c(NA, 
-10L))