How to convert a list of forecasts into a normalized table in R

Question

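I'm doing some time series forecasting in R using auto.arima() together with lapply to generate a large number of forecasts for a bunch of stores, like this:

my_data_set is a list of tibbles, each containing the store name, the date (monthly), and the sales value.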

mod <- function(x) forecast(auto.arima(x$Value), h)
all_arima <- lapply(my_data_set, mod)

all_arima is a list of forecasts (one per store).

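I'm struggling to turn this into a more consumable output, with one row per store/forecast period, one column for each upper/lower confidence bound (80, 95), and one column for the mean forecast.

If there is a better way to set this up from the start, I'd welcome suggestions on how to approach it differently.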

Answer 1

The output of forecast is a list. We can extract its components and convert them to a data.frame:

library(forecast)

# h is the forecast horizon (number of periods ahead), defined as in the question
mod <- function(x) {
  frcst <- forecast(auto.arima(x$Value), h = h)
  data.frame(Mean  = as.numeric(frcst$mean),
             lower = as.numeric(frcst$lower[, "95%"]),
             upper = as.numeric(frcst$upper[, "95%"]))
}

Then apply the function:

lapply(my_data_set, mod)
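
The call above still returns a list with one data frame per store. To get the single table the question asks for (one row per store/forecast period), the pieces can be stacked with a store label. The following is a minimal sketch, not part of the original answer: the sample data, the horizon h = 6, the column names Period, lower80, upper80, lower95, upper95, and the name all_forecasts are all assumptions made for illustration.

```r
library(forecast)

set.seed(1)
# Hypothetical stand-in for my_data_set: a named list, one data frame per store
my_data_set <- list(
  A = data.frame(Value = rnorm(60)),
  B = data.frame(Value = rnorm(60))
)
h <- 6  # assumed forecast horizon

mod <- function(x) {
  frcst <- forecast(auto.arima(x$Value), h = h)
  data.frame(
    Period  = seq_len(h),                        # forecast step 1..h
    Mean    = as.numeric(frcst$mean),
    lower80 = as.numeric(frcst$lower[, "80%"]),
    upper80 = as.numeric(frcst$upper[, "80%"]),
    lower95 = as.numeric(frcst$lower[, "95%"]),
    upper95 = as.numeric(frcst$upper[, "95%"])
  )
}

per_store <- lapply(my_data_set, mod)

# Prepend the store name to each piece, then stack everything into one table
labelled <- Map(function(store, fc) data.frame(Store = store, fc),
                names(per_store), per_store)
all_forecasts <- do.call(rbind, labelled)
rownames(all_forecasts) <- NULL

head(all_forecasts)
```

This relies on my_data_set being a named list, so that names(per_store) carries the store identifiers.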

Answer 2

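There are two approaches you can use here: (1) use the forecast package, as already suggested; (2) use the fable package, which is designed specifically for this kind of problem.

First, let's create some synthetic sample data.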

library(tibble)
library(dplyr)
df <- tibble(
  Store = rep(c("A", "B"), c(200,200)),
  Month = rep(seq(as.Date("1995-01-01"), length=200, by="1 month"), 2),
  Value = rnorm(400)
)

With the forecast package, we split the data into a list of tibbles. We can use the as.data.frame() function to simplify converting a forecast object into a data frame.

# Using forecast package
library(forecast)
my_data_set <- split(df, df$Store)
mod <- function(x) {
  x$Value %>%
    ts(frequency=12, start=lubridate::year(x$Month[1])) %>%
    auto.arima() %>%
    forecast() %>%
    as.data.frame()
}
lapply(my_data_set, mod)
#> $A
#>          Point Forecast     Lo 80    Hi 80  Lo 95 Hi 95
#> Sep 2011              0 -1.327999 1.327999 -2.031 2.031
#> Oct 2011              0 -1.327999 1.327999 -2.031 2.031
#> Nov 2011              0 -1.327999 1.327999 -2.031 2.031
#> Dec 2011              0 -1.327999 1.327999 -2.031 2.031
#> Jan 2012              0 -1.327999 1.327999 -2.031 2.031
#> Feb 2012              0 -1.327999 1.327999 -2.031 2.031
#> Mar 2012              0 -1.327999 1.327999 -2.031 2.031
#> Apr 2012              0 -1.327999 1.327999 -2.031 2.031
#> May 2012              0 -1.327999 1.327999 -2.031 2.031
#> Jun 2012              0 -1.327999 1.327999 -2.031 2.031
#> Jul 2012              0 -1.327999 1.327999 -2.031 2.031
#> Aug 2012              0 -1.327999 1.327999 -2.031 2.031
#> Sep 2012              0 -1.327999 1.327999 -2.031 2.031
#> Oct 2012              0 -1.327999 1.327999 -2.031 2.031
#> Nov 2012              0 -1.327999 1.327999 -2.031 2.031
#> Dec 2012              0 -1.327999 1.327999 -2.031 2.031
#> Jan 2013              0 -1.327999 1.327999 -2.031 2.031
#> Feb 2013              0 -1.327999 1.327999 -2.031 2.031
#> Mar 2013              0 -1.327999 1.327999 -2.031 2.031
#> Apr 2013              0 -1.327999 1.327999 -2.031 2.031
#> May 2013              0 -1.327999 1.327999 -2.031 2.031
#> Jun 2013              0 -1.327999 1.327999 -2.031 2.031
#> Jul 2013              0 -1.327999 1.327999 -2.031 2.031
#> Aug 2013              0 -1.327999 1.327999 -2.031 2.031
#> 
#> $B
#>          Point Forecast     Lo 80    Hi 80     Lo 95    Hi 95
#> Sep 2011              0 -1.274651 1.274651 -1.949411 1.949411
#> Oct 2011              0 -1.274651 1.274651 -1.949411 1.949411
#> Nov 2011              0 -1.274651 1.274651 -1.949411 1.949411
#> Dec 2011              0 -1.274651 1.274651 -1.949411 1.949411
#> Jan 2012              0 -1.274651 1.274651 -1.949411 1.949411
#> Feb 2012              0 -1.274651 1.274651 -1.949411 1.949411
#> Mar 2012              0 -1.274651 1.274651 -1.949411 1.949411
#> Apr 2012              0 -1.274651 1.274651 -1.949411 1.949411
#> May 2012              0 -1.274651 1.274651 -1.949411 1.949411
#> Jun 2012              0 -1.274651 1.274651 -1.949411 1.949411
#> Jul 2012              0 -1.274651 1.274651 -1.949411 1.949411
#> Aug 2012              0 -1.274651 1.274651 -1.949411 1.949411
#> Sep 2012              0 -1.274651 1.274651 -1.949411 1.949411
#> Oct 2012              0 -1.274651 1.274651 -1.949411 1.949411
#> Nov 2012              0 -1.274651 1.274651 -1.949411 1.949411
#> Dec 2012              0 -1.274651 1.274651 -1.949411 1.949411
#> Jan 2013              0 -1.274651 1.274651 -1.949411 1.949411
#> Feb 2013              0 -1.274651 1.274651 -1.949411 1.949411
#> Mar 2013              0 -1.274651 1.274651 -1.949411 1.949411
#> Apr 2013              0 -1.274651 1.274651 -1.949411 1.949411
#> May 2013              0 -1.274651 1.274651 -1.949411 1.949411
#> Jun 2013              0 -1.274651 1.274651 -1.949411 1.949411
#> Jul 2013              0 -1.274651 1.274651 -1.949411 1.949411
#> Aug 2013              0 -1.274651 1.274651 -1.949411 1.949411
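
The result above is still a list with one data frame per store, and the forecast months live in the row names. One possible way to flatten it into a single table is to move the row names into a column and bind the pieces together. This is a sketch under assumptions, not part of the original answer: the shorter sample series, the horizon h = 12, and the name all_forecasts are chosen only for illustration.

```r
library(forecast)
library(dplyr)
library(tibble)

set.seed(123)
# Smaller synthetic data set with the same layout as df above (assumed sizes)
df <- tibble(
  Store = rep(c("A", "B"), c(60, 60)),
  Month = rep(seq(as.Date("2015-01-01"), length.out = 60, by = "1 month"), 2),
  Value = rnorm(120)
)

mod <- function(x) {
  x$Value %>%
    ts(frequency = 12, start = as.integer(format(x$Month[1], "%Y"))) %>%
    auto.arima() %>%
    forecast(h = 12) %>%
    as.data.frame() %>%
    rownames_to_column(var = "Month")  # keep the period label as a column
}

# One row per store and forecast month; list names become the Store column
all_forecasts <- split(df, df$Store) %>%
  lapply(mod) %>%
  bind_rows(.id = "Store")

head(all_forecasts)
```

The Month column here is a character label such as the ones shown in the row names above; it would need further parsing if a date type is required.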

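To use the fable package, we can simply take the original data frame containing all stores, convert it to a tsibble object, and pipe it into the model and forecast steps, as shown below.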

# Using fable
library(tsibble)
library(fable)
df %>%
  mutate(Month = yearmonth(Month)) %>%
  as_tsibble(index=Month, key=Store) %>%
  model(ARIMA(Value)) %>%
  forecast() %>%
  mutate(
    pi80 = hilo(Value, 80),
    pi95 = hilo(Value, 95)
  ) %>%
  unpack_hilo(cols = c(pi80, pi95))
#> # A fable: 48 x 9 [1M]
#> # Key:     Store, .model [2]
#>    Store .model          Month     Value .mean pi80_lower pi80_upper pi95_lower
#>    <chr> <chr>           <mth>    <dist> <dbl>      <dbl>      <dbl>      <dbl>
#>  1 A     ARIMA(Value) 2011 Sep N(0, 1.1)     0      -1.33       1.33      -2.03
#>  2 A     ARIMA(Value) 2011 Oct N(0, 1.1)     0      -1.33       1.33      -2.03
#>  3 A     ARIMA(Value) 2011 Nov N(0, 1.1)     0      -1.33       1.33      -2.03
#>  4 A     ARIMA(Value) 2011 Dec N(0, 1.1)     0      -1.33       1.33      -2.03
#>  5 A     ARIMA(Value) 2012 Jan N(0, 1.1)     0      -1.33       1.33      -2.03
#>  6 A     ARIMA(Value) 2012 Feb N(0, 1.1)     0      -1.33       1.33      -2.03
#>  7 A     ARIMA(Value) 2012 Mar N(0, 1.1)     0      -1.33       1.33      -2.03
#>  8 A     ARIMA(Value) 2012 Apr N(0, 1.1)     0      -1.33       1.33      -2.03
#>  9 A     ARIMA(Value) 2012 May N(0, 1.1)     0      -1.33       1.33      -2.03
#> 10 A     ARIMA(Value) 2012 Jun N(0, 1.1)     0      -1.33       1.33      -2.03
#> # … with 38 more rows, and 1 more variable: pi95_upper <dbl>

Created on 2021-09-30 by the reprex package (v2.0.1)

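This approach is also more flexible, because there can be several grouping variables (for example, store and product). The fable approach is documented in the open-access textbook at https://OTexts.com/fpp3.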


r

time-series

forecasting

arima