How to cast aggregated values over a range in R and fill missing range values with zero

df <- data.frame(year = sample(2014:2016, 100, replace = TRUE),
                 month = sample(c(1:5, 8:12), 100, replace = TRUE),
                 int = 1)

# install.packages("reshape")
library(reshape)
month <- cast(df, year ~ month, sum, value = 'int')
month

Output:

# output
  year 1 2 3 4 5 8 9 10 11 12
1 2014 6 5 4 3 4 4 3  3  9  2
2 2015 4 9 1 3 1 4 3  3  2  3
3 2016 0 3 3 4 4 1 4  1  3  1

How can I set the missing months to zero? The result should look like this:

# output
  year 1 2 3 4 5 >6< >7< 8 9 10 11 12
1 2014 6 5 4 3 4  0   0  4 3  3  9  2
2 2015 4 9 1 3 1  0   0  4 3  3  2  3
3 2016 0 3 3 4 4  0   0  1 4  1  3  1

Is there a way to do this with the cast function?

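We can use tidyverse: convert 'month' to a factor with levels specified as 1:12, get the sum of 'int' grouped by 'year' and 'month', and spread to 'wide' format with drop = FALSE so that the unused levels 6 and 7 are kept as columns (fill = 0 sets their values to zero).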

library(tidyverse)
df %>%
   group_by(year, month = factor(month, levels = 1:12)) %>% 
   summarise(int = sum(int)) %>% 
   spread(month, int, drop = FALSE, fill = 0) 
#     year   `1`   `2`   `3`   `4`   `5`   `6`   `7`   `8`   `9`  `10`  `11`  `12`
#* <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1  2014     3     2     2     1     2     0     0     4     1     5     5     6
#2  2015     2     7     5     2     4     0     0     5     3     3     4     5
#3  2016     0     4     5     5     2     0     0     3     2     1     5     2
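
In current tidyr, spread() is superseded by pivot_wider(). The following is a minimal sketch of the same idea, not part of the original answer; it assumes tidyr >= 1.2.0 (for the names_expand argument) and dplyr >= 1.0.0 (for .groups). The set.seed() call and the name `wide` are only there to make the sketch reproducible and are not from the question.

```r
library(dplyr)
library(tidyr)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(year = sample(2014:2016, 100, replace = TRUE),
                 month = sample(c(1:5, 8:12), 100, replace = TRUE),
                 int = 1)

wide <- df %>%
  # declare all 12 months as factor levels, including the unobserved 6 and 7
  mutate(month = factor(month, levels = 1:12)) %>%
  group_by(year, month) %>%
  summarise(int = sum(int), .groups = "drop") %>%
  # names_expand = TRUE creates a column for every factor level,
  # values_fill = 0 puts zero where a year/month combination has no rows
  pivot_wider(names_from = month, values_from = int,
              names_expand = TRUE, values_fill = 0)
wide
```

The key step is the same as above: once 'month' is a factor with levels 1:12, the reshaping function only has to be told not to drop the empty levels.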

Or in one line with dcast

library(data.table)
dcast(setDT(df), year ~ factor(month, levels = 1:12), sum, value.var = "int", drop = FALSE)
#   year 1 2 3 4 5 6 7 8 9 10 11 12
#1: 2014 3 2 2 1 2 0 0 4 1  5  5  6
#2: 2015 2 7 5 2 4 0 0 5 3  3  4  5
#3: 2016 0 4 5 5 2 0 0 3 2  1  5  2

Or with xtabs from base R

xtabs(int ~ year + factor(month, levels = 1:12), df)
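
xtabs returns a contingency table rather than a data frame. If a data frame is needed, one possible follow-up (a sketch, not from the original answer) is as.data.frame.matrix(), which keeps the years as row names and the 12 months as columns. The set.seed() call and the names `tab` and `wide` are assumptions made for illustration.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(year = sample(2014:2016, 100, replace = TRUE),
                 month = sample(c(1:5, 8:12), 100, replace = TRUE),
                 int = 1)

# sum 'int' by year and month; the factor levels force columns for all 12 months
tab <- xtabs(int ~ year + factor(month, levels = 1:12), data = df)

# convert the 2-d table to a plain data frame (years become row names)
wide <- as.data.frame.matrix(tab)
wide
```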