R基于列值聚合数据框

R Aggregate data frame based on column values

我有一个如下所示的数据集:

> newex
        Name Volume Period
1        oil  29000 Jun 21
2       gold    800 Mar 22
3        oil  21000 Jul 21
4       gold   1100 Sep 21
5       gold   3000 Feb 21
6    depower      3  Q1 21
7        oil  23000 Apr 22
8    czpower     26  Q1 23
9        oil  17000  Q1 21
10      gold   2400 May 21
11       oil  12000  Q2 21
12      gold   1800 Jan 22
13   czpower     21 Oct 21
14  api2coal   6000  Q1 22
15  api2coal  11000  Q1 21
16   depower     11 Jan 22
17  api2coal  16000 Jul 21
18      gold   1300 Mar 21
19   depower      3  Q1 22
20       oil  17000 Cal 21

我想重塑数据集以获得具有以下特征的数据框:

有人可以给我一个关于如何实现这一点的提示吗?提前谢谢你。

我们可以使用 pivot_wider:

library(dplyr)
library(tidyr)

newex %>% 
  pivot_wider(
    names_from = Name,
    values_from = Volume
  )
   Period   oil  gold depower czpower api2coal
   <chr>  <int> <int>   <int>   <int>    <int>
 1 Jun 21 29000    NA      NA      NA       NA
 2 Mar 22    NA   800      NA      NA       NA
 3 Jul 21 21000    NA      NA      NA    16000
 4 Sep 21    NA  1100      NA      NA       NA
 5 Feb 21    NA  3000      NA      NA       NA
 6 Q1  21 17000    NA       3      NA    11000
 7 Apr 22 23000    NA      NA      NA       NA
 8 Q1  23    NA    NA      NA      26       NA
 9 May 21    NA  2400      NA      NA       NA
10 Q2  21 12000    NA      NA      NA       NA
11 Jan 22    NA  1800      11      NA       NA
12 Oct 21    NA    NA      NA      21       NA
13 Q1  22    NA    NA       3      NA     6000
14 Mar 21    NA  1300      NA      NA       NA
15 Cal 21 17000    NA      NA      NA       NA

数据:

newex <- structure(list(Name = c("oil", "gold", "oil", "gold", "gold", 
"depower", "oil", "czpower", "oil", "gold", "oil", "gold", "czpower", 
"api2coal", "api2coal", "depower", "api2coal", "gold", "depower", 
"oil"), Volume = c(29000L, 800L, 21000L, 1100L, 3000L, 3L, 23000L, 
26L, 17000L, 2400L, 12000L, 1800L, 21L, 6000L, 11000L, 11L, 16000L, 
1300L, 3L, 17000L), Period = c("Jun 21", "Mar 22", "Jul 21", 
"Sep 21", "Feb 21", "Q1  21", "Apr 22", "Q1  23", "Q1  21", "May 21", 
"Q2  21", "Jan 22", "Oct 21", "Q1  22", "Q1  21", "Jan 22", "Jul 21", 
"Mar 21", "Q1  22", "Cal 21")), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", 
"14", "15", "16", "17", "18", "19", "20"))

我在尝试之前的答案后找到了答案:

newex %>% 
 pivot_wider(
     names_from = Name,
     values_from = Volume, values_fn = sum)
library(data.table)
dcast(setDT(newex),Period~Name, value.var="Volume",fun.aggregate = sum)

输出:

    Period api2coal czpower depower gold   oil
 1: Apr 22        0       0       0    0 23000
 2: Cal 21        0       0       0    0 17000
 3: Feb 21        0       0       0 3000     0
 4: Jan 22        0       0      11 1800     0
 5: Jul 21    16000       0       0    0 21000
 6: Jun 21        0       0       0    0 29000
 7: Mar 21        0       0       0 1300     0
 8: Mar 22        0       0       0  800     0
 9: May 21        0       0       0 2400     0
10: Oct 21        0      21       0    0     0
11: Q1  21    11000       0       3    0 17000
12: Q1  22     6000       0       3    0     0
13: Q1  23        0      26       0    0     0
14: Q2  21        0       0       0    0 12000
15: Sep 21        0       0       0 1100     0