'augment()' function in fabletools

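I am trying to extract forecast residuals using the fabletools package. I know I can extract the fitted-model residuals with the augment() function, but I don't know how it works for forecast values: the result I get is identical to the fitted-model residuals. Here is an example: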

library(fable)
library(tsibble)
lung_deaths <- as_tsibble(cbind(mdeaths, fdeaths))

## fitted model residuals
lung_deaths %>%
  dplyr::filter(index < yearmonth("1979 Jan")) %>%
  model(
    ets = ETS(value ~ error("M") + trend("A") + season("A"))
  ) %>%
  augment()
# A tsibble: 120 x 7 [1M]
# Key:       key, .model [2]
   key     .model    index value .fitted  .resid   .innov
   <chr>   <chr>     <mth> <dbl>   <dbl>   <dbl>    <dbl>
 1 fdeaths ets    1974 Jan   901    837.   64.0   0.0765 
 2 fdeaths ets    1974 Feb   689    877. -188.   -0.214  
 3 fdeaths ets    1974 Mar   827    795.   31.7   0.0399 
 4 fdeaths ets    1974 Apr   677    624.   53.2   0.0852 
 5 fdeaths ets    1974 May   522    515.    7.38  0.0144 
 6 fdeaths ets    1974 Jun   406    453.  -47.0  -0.104  
 7 fdeaths ets    1974 Jul   441    431.    9.60  0.0223 
 8 fdeaths ets    1974 Aug   393    388.    4.96  0.0128 
 9 fdeaths ets    1974 Sep   387    384.    2.57  0.00668
10 fdeaths ets    1974 Oct   582    480.  102.    0.212  
# ... with 110 more rows

## forecast residuals
test <- lung_deaths %>%
  dplyr::filter(index < yearmonth("1979 Jan")) %>%
  model(
    ets = ETS(value ~ error("M") + trend("A") + season("A"))
  ) %>%
  forecast(h = "1 year")

## defining newdata
Data <- lung_deaths %>%
  dplyr::filter(index >= yearmonth("1979 Jan"))

augment(test, newdata = Data, type.predict = "response")
# A tsibble: 120 x 7 [1M]
# Key:       key, .model [2]
   key     .model    index value .fitted  .resid   .innov
   <chr>   <chr>     <mth> <dbl>   <dbl>   <dbl>    <dbl>
 1 fdeaths ets    1974 Jan   901    837.   64.0   0.0765 
 2 fdeaths ets    1974 Feb   689    877. -188.   -0.214  
 3 fdeaths ets    1974 Mar   827    795.   31.7   0.0399 
 4 fdeaths ets    1974 Apr   677    624.   53.2   0.0852 
 5 fdeaths ets    1974 May   522    515.    7.38  0.0144 
 6 fdeaths ets    1974 Jun   406    453.  -47.0  -0.104  
 7 fdeaths ets    1974 Jul   441    431.    9.60  0.0223 
 8 fdeaths ets    1974 Aug   393    388.    4.96  0.0128 
 9 fdeaths ets    1974 Sep   387    384.    2.57  0.00668
10 fdeaths ets    1974 Oct   582    480.  102.    0.212  
# ... with 110 more rows

Any suggestions would be greatly appreciated.

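I think you probably want forecast errors: the differences between the observed values and the forecasts. See https://otexts.com/fpp3/accuracy.html for a discussion. To quote that chapter: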

Note that forecast errors are different from residuals in two ways. First, residuals are calculated on the training set while forecast errors are calculated on the test set. Second, residuals are based on one-step forecasts while forecast errors can involve multi-step forecasts.

Here is some code that computes the forecast errors for your example.

library(fable)
library(tsibble)
library(dplyr)

lung_deaths <- as_tsibble(cbind(mdeaths, fdeaths))

## forecasts
fcast <- lung_deaths %>%
  dplyr::filter(index < yearmonth("1979 Jan")) %>%
  model(
    ets = ETS(value ~ error("M") + trend("A") + season("A"))
  ) %>%
  forecast(h = "1 year") 

## defining newdata
new_data <- lung_deaths %>%
  dplyr::filter(index >= yearmonth("1979 Jan")) %>%
  rename(actual = value)

# Compute forecast errors
fcast %>%
  left_join(new_data) %>%
  mutate(error = actual - .mean)
#> Joining, by = c("key", "index")
#> # A fable: 24 x 7 [1M]
#> # Key:     key, .model [2]
#>    key     .model    index        value .mean actual error
#>    <chr>   <chr>     <mth>       <dist> <dbl>  <dbl> <dbl>
#>  1 fdeaths ets    1979 Jan N(783, 8522)  783.    821  37.5
#>  2 fdeaths ets    1979 Feb N(823, 9412)  823.    785 -38.4
#>  3 fdeaths ets    1979 Mar N(742, 7639)  742.    727 -14.8
#>  4 fdeaths ets    1979 Apr N(570, 4516)  570.    612  41.7
#>  5 fdeaths ets    1979 May N(461, 2951)  461.    478  16.9
#>  6 fdeaths ets    1979 Jun N(400, 2216)  400.    429  29.5
#>  7 fdeaths ets    1979 Jul N(378, 1982)  378.    405  27.1
#>  8 fdeaths ets    1979 Aug N(335, 1553)  335.    379  44.5
#>  9 fdeaths ets    1979 Sep N(331, 1520)  331.    393  62.1
#> 10 fdeaths ets    1979 Oct N(427, 2527)  427.    411 -15.7
#> # … with 14 more rows

Created on 2020-11-03 by the reprex package (v0.3.0)
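
If you only need summary measures of those forecast errors rather than the individual values, the chapter linked above uses accuracy() for that. The following is a minimal sketch, assuming the same model and training/test split as above; the object names fcast and acc are just illustrative. accuracy() is given the forecasts together with a data set containing the test-period observations, and it matches them up by key and index.

```r
library(fable)
library(tsibble)
library(dplyr)

lung_deaths <- as_tsibble(cbind(mdeaths, fdeaths))

## forecasts from a model trained on data before 1979
fcast <- lung_deaths %>%
  dplyr::filter(index < yearmonth("1979 Jan")) %>%
  model(
    ets = ETS(value ~ error("M") + trend("A") + season("A"))
  ) %>%
  forecast(h = "1 year")

## summary accuracy measures on the test set, one row per series;
## the full data set is passed so that the 1979 observations are available
acc <- fcast %>%
  accuracy(lung_deaths)
acc
```

The result has one row for each key and model combination, with columns such as RMSE and MAE computed from the same test-set errors as in the left_join() approach.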