How to cast aggregated values over a range in R and fill missing range values with zero
df <- data.frame(year = sample(2014:2016, 100, replace = TRUE),
                 month = sample(c(1:5, 8:12), 100, replace = TRUE),
                 int = 1)
# install.packages("reshape")
library(reshape)
month <- cast(df, year ~ month, sum, value = 'int')
month
Output:
# output
year 1 2 3 4 5 8 9 10 11 12
1 2014 6 5 4 3 4 4 3 3 9 2
2 2015 4 9 1 3 1 4 3 3 2 3
3 2016 0 3 3 4 4 1 4 1 3 1
How can I set the missing months to zero? The result should look like this:
# output
year 1 2 3 4 5 >6< >7< 8 9 10 11 12
1 2014 6 5 4 3 4 0 0 4 3 3 9 2
2 2015 4 9 1 3 1 0 0 4 3 3 2 3
3 2016 0 3 3 4 4 0 0 1 4 1 3 1
Is there a way to do this with the cast function?
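We can use the tidyverse. Convert 'month' to a factor with levels specified as 1:12, get the sum of 'int' grouped by 'year' and 'month', and spread to 'wide' format with drop = FALSE so that the unused levels (months 6 and 7) are kept, and fill = 0 so that they show zero instead of NA: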
library(tidyverse)
df %>%
group_by(year, month = factor(month, levels = 1:12)) %>%
summarise(int = sum(int)) %>%
spread(month, int, drop = FALSE, fill = 0)
# year `1` `2` `3` `4` `5` `6` `7` `8` `9` `10` `11` `12`
#* <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 2014 3 2 2 1 2 0 0 4 1 5 5 6
#2 2015 2 7 5 2 4 0 0 5 3 3 4 5
#3 2016 0 4 5 5 2 0 0 3 2 1 5 2
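In current tidyr releases, spread() is superseded by pivot_wider(). As a hedged alternative, the same result can be sketched with complete(), which adds the missing year/month combinations explicitly before reshaping. This assumes dplyr 1.0.0 or later and tidyr 1.0.0 or later; the seed and the name `wide` are illustrative choices, not part of the original question.

```r
library(dplyr)
library(tidyr)

set.seed(42)  # arbitrary seed, only to make the sketch reproducible
df <- data.frame(year = sample(2014:2016, 100, replace = TRUE),
                 month = sample(c(1:5, 8:12), 100, replace = TRUE),
                 int = 1)

wide <- df %>%
  group_by(year, month) %>%
  summarise(int = sum(int), .groups = "drop") %>%
  # add a row for every year x month in 1:12; new rows get int = 0
  complete(year, month = 1:12, fill = list(int = 0)) %>%
  # one column per month, in the order produced by complete()
  pivot_wider(names_from = month, values_from = int)

wide
```

Here the full range is supplied directly to complete() instead of through factor levels, so 'month' stays an integer column until it is pivoted.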
Or in one line with dcast:
library(data.table)
dcast(setDT(df), year ~ factor(month, levels = 1:12), sum, value.var = "int", drop = FALSE)
# year 1 2 3 4 5 6 7 8 9 10 11 12
#1: 2014 3 2 2 1 2 0 0 4 1 5 5 6
#2: 2015 2 7 5 2 4 0 0 5 3 3 4 5
#3: 2016 0 4 5 5 2 0 0 3 2 1 5 2
Or xtabs from base R:
xtabs(int~year+factor(month, levels = 1:12), df)
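xtabs() returns a contingency table rather than a data frame. If a data frame shaped like the cast() output is wanted, a minimal base R sketch could look like the following. The seed and the names `tab` and `wide` are illustrative assumptions.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
df <- data.frame(year = sample(2014:2016, 100, replace = TRUE),
                 month = sample(c(1:5, 8:12), 100, replace = TRUE),
                 int = 1)

# Declare all 12 months as levels so the unused ones (6 and 7) are kept
df$month <- factor(df$month, levels = 1:12)

# Sum 'int' for every year x month cell; empty cells become 0
tab <- xtabs(int ~ year + month, data = df)

# Convert the 2-D table to a data frame with 'year' as a regular column
wide <- data.frame(year = as.integer(rownames(tab)),
                   as.data.frame.matrix(tab),
                   check.names = FALSE)
rownames(wide) <- NULL

wide
```

check.names = FALSE keeps the month columns named 1 to 12 instead of X1 to X12.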