'augment()' function in fabletools
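I am trying to extract forecast residuals using the fabletools package. I know I can use the augment() function to extract the fitted model residuals, but I don't understand how it works for forecast values: the results I get are identical to the fitted model residuals. Here is an example: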
library(fable)
library(tsibble)
lung_deaths <- as_tsibble(cbind(mdeaths, fdeaths))
## fitted model residuals
lung_deaths %>%
  dplyr::filter(index < yearmonth("1979 Jan")) %>%
  model(
    ets = ETS(value ~ error("M") + trend("A") + season("A"))
  ) %>%
  augment()
# A tsibble: 120 x 7 [1M]
# Key: key, .model [2]
key .model index value .fitted .resid .innov
<chr> <chr> <mth> <dbl> <dbl> <dbl> <dbl>
1 fdeaths ets 1974 Jan 901 837. 64.0 0.0765
2 fdeaths ets 1974 Feb 689 877. -188. -0.214
3 fdeaths ets 1974 Mar 827 795. 31.7 0.0399
4 fdeaths ets 1974 Apr 677 624. 53.2 0.0852
5 fdeaths ets 1974 May 522 515. 7.38 0.0144
6 fdeaths ets 1974 Jun 406 453. -47.0 -0.104
7 fdeaths ets 1974 Jul 441 431. 9.60 0.0223
8 fdeaths ets 1974 Aug 393 388. 4.96 0.0128
9 fdeaths ets 1974 Sep 387 384. 2.57 0.00668
10 fdeaths ets 1974 Oct 582 480. 102. 0.212
# ... with 110 more rows
## forecast residuals
test <- lung_deaths %>%
  dplyr::filter(index < yearmonth("1979 Jan")) %>%
  model(
    ets = ETS(value ~ error("M") + trend("A") + season("A"))
  ) %>%
  forecast(h = "1 year")
## defining newdata
Data <- lung_deaths %>%
  dplyr::filter(index >= yearmonth("1979 Jan"))
augment(test, newdata = Data, type.predict='response')
# A tsibble: 120 x 7 [1M]
# Key: key, .model [2]
key .model index value .fitted .resid .innov
<chr> <chr> <mth> <dbl> <dbl> <dbl> <dbl>
1 fdeaths ets 1974 Jan 901 837. 64.0 0.0765
2 fdeaths ets 1974 Feb 689 877. -188. -0.214
3 fdeaths ets 1974 Mar 827 795. 31.7 0.0399
4 fdeaths ets 1974 Apr 677 624. 53.2 0.0852
5 fdeaths ets 1974 May 522 515. 7.38 0.0144
6 fdeaths ets 1974 Jun 406 453. -47.0 -0.104
7 fdeaths ets 1974 Jul 441 431. 9.60 0.0223
8 fdeaths ets 1974 Aug 393 388. 4.96 0.0128
9 fdeaths ets 1974 Sep 387 384. 2.57 0.00668
10 fdeaths ets 1974 Oct 582 480. 102. 0.212
# ... with 110 more rows
Any suggestions would be appreciated.
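I think you probably want forecast errors, that is, the difference between the observed values and the forecasts. See https://otexts.com/fpp3/accuracy.html for a discussion. Quoting that chapter: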
Note that forecast errors are different from residuals in two ways. First, residuals are calculated on the training set while forecast errors are calculated on the test set. Second, residuals are based on one-step forecasts while forecast errors can involve multi-step forecasts.
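Written out, the distinction in the quote looks roughly like this. This is a restatement in the usual textbook notation, assuming the training data are $y_1, \dots, y_T$ and the test data start at $T+1$; the symbols are not taken from the question itself.

```latex
% Residual: one-step error on the training set (t = 1, ..., T)
e_t = y_t - \hat{y}_{t \mid t-1}

% Forecast error: h-step error on the test set (h = 1, 2, ...)
e_{T+h} = y_{T+h} - \hat{y}_{T+h \mid T}
```

In the augment() output above, the .resid column corresponds to the first line (value minus .fitted), while the error you are after corresponds to the second.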
Here is some code that computes the forecast errors for your example.
library(fable)
library(tsibble)
library(dplyr)
lung_deaths <- as_tsibble(cbind(mdeaths, fdeaths))
## forecasts
fcast <- lung_deaths %>%
  dplyr::filter(index < yearmonth("1979 Jan")) %>%
  model(
    ets = ETS(value ~ error("M") + trend("A") + season("A"))
  ) %>%
  forecast(h = "1 year")
## defining newdata
new_data <- lung_deaths %>%
  dplyr::filter(index >= yearmonth("1979 Jan")) %>%
  rename(actual = value)
# Compute forecast errors
fcast %>%
  left_join(new_data) %>%
  mutate(error = actual - .mean)
#> Joining, by = c("key", "index")
#> # A fable: 24 x 7 [1M]
#> # Key: key, .model [2]
#> key .model index value .mean actual error
#> <chr> <chr> <mth> <dist> <dbl> <dbl> <dbl>
#> 1 fdeaths ets 1979 Jan N(783, 8522) 783. 821 37.5
#> 2 fdeaths ets 1979 Feb N(823, 9412) 823. 785 -38.4
#> 3 fdeaths ets 1979 Mar N(742, 7639) 742. 727 -14.8
#> 4 fdeaths ets 1979 Apr N(570, 4516) 570. 612 41.7
#> 5 fdeaths ets 1979 May N(461, 2951) 461. 478 16.9
#> 6 fdeaths ets 1979 Jun N(400, 2216) 400. 429 29.5
#> 7 fdeaths ets 1979 Jul N(378, 1982) 378. 405 27.1
#> 8 fdeaths ets 1979 Aug N(335, 1553) 335. 379 44.5
#> 9 fdeaths ets 1979 Sep N(331, 1520) 331. 393 62.1
#> 10 fdeaths ets 1979 Oct N(427, 2527) 427. 411 -15.7
#> # … with 14 more rows
Created on 2020-11-03 by the reprex package (v0.3.0)
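If you only need summary measures of those forecast errors rather than the errors themselves, the accuracy() function from fabletools may be enough. The sketch below assumes the same model and train/test split as above; the object name acc is just illustrative, and the exact set of columns returned may vary between package versions.

```r
library(fable)
library(tsibble)
library(dplyr)

lung_deaths <- as_tsibble(cbind(mdeaths, fdeaths))

# Fit on data before 1979 and forecast the following year
fcast <- lung_deaths %>%
  filter(index < yearmonth("1979 Jan")) %>%
  model(ets = ETS(value ~ error("M") + trend("A") + season("A"))) %>%
  forecast(h = "1 year")

# accuracy() matches the forecasts to the observations in the supplied
# data, so passing the full series covers the 1979 test period.
acc <- accuracy(fcast, lung_deaths)
acc
```

The result should have one row per series, with columns such as ME, RMSE and MAE computed from the test-set forecast errors.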