How to make a rolling sum (or rolling average) with multiple variables
I'm new to the zoo package, so this may be a simple question.
I have the following data frame (df):
library(lubridate)
library(zoo)
library(dplyr)
Date <- c("2010-01-28", "2010-01-28", "2010-02-28",
"2010-02-28", "2010-03-28", "2010-03-28",
"2010-04-28", "2010-04-28")
Date <- as_date(Date)
Amount <- 1:8
Prod <- c("Corn", "Potato","Corn", "Potato","Corn", "Potato","Corn", "Potato")
df <- data.frame(Date, Prod, Amount)
print(df)
Date Prod Amount
2010-01-28 Corn 1
2010-01-28 Potato 2
2010-02-28 Corn 3
2010-02-28 Potato 4
2010-03-28 Corn 5
2010-03-28 Potato 6
2010-04-28 Corn 7
2010-04-28 Potato 8
What I want is to compute a rolling sum for each variable with a "window" of 3 days, and then build a new data frame that looks like this:
Date Prod Amount
2010-03-28 Corn 9
2010-03-28 Potato 12
2010-04-28 Corn 15
2010-04-28 Potato 18
Perhaps rollapply() and dplyr can do the job, but I don't know how to approach it.
I'd appreciate any help :)
I did it with dplyr::lag():
library(dplyr)
library(tibble)
## Data
data <- tribble(
~Date, ~Prod, ~Amount,
"2010-01-28", "Corn", 1,
"2010-01-28", "Potato", 2,
"2010-02-28", "Corn", 3,
"2010-02-28", "Potato", 4,
"2010-03-28", "Corn", 5,
"2010-03-28", "Potato", 6,
"2010-04-28", "Corn", 7,
"2010-04-28", "Potato", 8
)
# Code
data %>%
group_by(Prod) %>%
mutate(cum_amount = Amount + lag(Amount, 1) + lag(Amount, 2)) %>%
filter(!is.na(cum_amount))
# A tibble: 4 x 4
# Groups: Prod [2]
Date Prod Amount cum_amount
<chr> <chr> <dbl> <dbl>
1 2010-03-28 Corn 5 9
2 2010-03-28 Potato 6 12
3 2010-04-28 Corn 7 15
4 2010-04-28 Potato 8 18
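One caveat with the lag() approach: it sums the current row and the two rows physically before it within each group, so it depends on the rows already being in date order. A minimal sketch, assuming the same toy data in a deliberately shuffled order (the `shuffled` and `result` names are only for illustration), sorts within each product before lagging:

```r
library(dplyr)

# Same toy data as in the question, rebuilt here so the block is self-contained
df <- data.frame(
  Date   = as.Date(rep(c("2010-01-28", "2010-02-28",
                         "2010-03-28", "2010-04-28"), each = 2)),
  Prod   = rep(c("Corn", "Potato"), times = 4),
  Amount = as.numeric(1:8)
)

# Hypothetical: the rows arrive in an arbitrary order
shuffled <- df[c(8, 3, 5, 2, 1, 6, 7, 4), ]

result <- shuffled %>%
  group_by(Prod) %>%
  arrange(Date, .by_group = TRUE) %>%   # sort by Prod, then by Date
  mutate(cum_amount = Amount + lag(Amount, 1) + lag(Amount, 2)) %>%
  filter(!is.na(cum_amount)) %>%        # drop rows without a full window
  ungroup()

print(result)
```

Because of the sort, the rows come back grouped by product (Corn first, then Potato) rather than interleaved by date.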
Updated in response to your comment:
data %>%
group_by(Prod) %>%
mutate(cum_amount = c(rep(NA, 2), zoo::rollsum(Amount, 3))) %>%
filter(!is.na(cum_amount))
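As a possible variation on the snippet above, zoo also ships right-aligned wrappers, rollsumr() and rollmeanr(), whose `fill = NA` argument pads the incomplete windows so the manual `c(rep(NA, 2), ...)` is not needed. The sketch below assumes the same toy data and also covers the rolling average mentioned in the title; the column names `roll_sum` and `roll_mean` are only illustrative:

```r
library(dplyr)
library(zoo)

# Same toy data as in the question, rebuilt here so the block is self-contained
df <- data.frame(
  Date   = as.Date(rep(c("2010-01-28", "2010-02-28",
                         "2010-03-28", "2010-04-28"), each = 2)),
  Prod   = rep(c("Corn", "Potato"), times = 4),
  Amount = as.numeric(1:8)
)

rolled <- df %>%
  group_by(Prod) %>%
  mutate(
    # right-aligned window of 3 observations; incomplete windows become NA
    roll_sum  = rollsumr(Amount, k = 3, fill = NA),
    roll_mean = rollmeanr(Amount, k = 3, fill = NA)
  ) %>%
  filter(!is.na(roll_sum)) %>%   # keep only rows with a full window
  ungroup()

print(rolled)
```

Note that the window here counts observations per product, not calendar days, so it matches the desired output only because each product has exactly one row per date.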
PS: remember to include the R tag in your question.