EDIT: FIXED -- Computational instability in R Forecast package?
Original question:
I have the following time series data, observed daily:
series <- c(10, 25, 8, 27, 18, 21, 12, 9, 31, 18, 8, 30, 14, 13, 10, 14,
14, 14, 6, 9, 22, 21, 22, 8, 7, 6, 22, 21, 36, 16, 2, 13, 23,
40, 12, 27, 18, 10, 11, 37, 44, 30, 40, 25, 13, 11, 58, 56, 46,
39, 28, 27, 19, 20, 97, 90, 70, 73, 30, 22, 97, 34)
and would like to fit it using tbats from the R forecast package. I would also like to model it with weekly correlation (a seasonal period of 7):
library(forecast)
x.msts = msts(series,seasonal.periods = 7)
model <- tbats(x.msts)
# shows "--- loading profile ---"
Examining/plotting str of the model reveals a huge fitted variance of 4.9e+17.
And, plotting the future forecasts, we observe enormous fluctuations:
> forecast(model)$mean
Multi-Seasonal Time Series:
Start: 9 7
Seasonal Periods: 7
Data:
[1] 1.483789e+44 -1.399297e+42 -2.566455e+44 -1.374316e+43 -1.527758e+38
[6] 2.036194e+42 5.639596e+42 8.231600e+40 -2.578859e+41 -1.355840e+43
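To put a number on how far out of scale these forecasts are, here is a minimal base R sketch that compares them with the observed data. The helper name `forecast_blew_up` and the threshold `tol = 10` are assumptions made for illustration; neither comes from the forecast package.

```r
# Observed series, copied from the question.
series <- c(10, 25, 8, 27, 18, 21, 12, 9, 31, 18, 8, 30, 14, 13, 10, 14,
            14, 14, 6, 9, 22, 21, 22, 8, 7, 6, 22, 21, 36, 16, 2, 13, 23,
            40, 12, 27, 18, 10, 11, 37, 44, 30, 40, 25, 13, 11, 58, 56, 46,
            39, 28, 27, 19, 20, 97, 90, 70, 73, 30, 22, 97, 34)

# Forecast means, copied from the output above.
fc_mean <- c(1.483789e+44, -1.399297e+42, -2.566455e+44, -1.374316e+43,
             -1.527758e+38, 2.036194e+42, 5.639596e+42, 8.231600e+40,
             -2.578859e+41, -1.355840e+43)

# Hypothetical helper (not part of forecast): flag a forecast as suspicious
# if it contains non-finite values or if its largest magnitude exceeds
# `tol` times the largest magnitude seen in the observed data.
forecast_blew_up <- function(fc, observed, tol = 10) {
  any(!is.finite(fc)) || max(abs(fc)) > tol * max(abs(observed))
}

forecast_blew_up(fc_mean, series)  # TRUE: 2.57e+44 is far above 10 * 97
```

The threshold is arbitrary; the point is only that forecasts dozens of orders of magnitude beyond the observed range (2 to 97) cannot be a sensible continuation of this series.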
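Are these estimates the "correct" solution of the TBATS model fitting procedure, or is there a bug in the forecast package? If it is not a bug, could someone help me understand mathematically why this seemingly normal time series produces these estimates?
This is my first post on CV, so apologies if this should be on SO!
Post-answer update:
I have filed a bug report on github.
A few people also noted that I was not using multiple seasonal periods, so I want to show here that the bug is still a problem in that case: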
x2.msts <- msts(series,seasonal.periods = c(7,30))
model_x2_1 <- tbats(x2.msts) # high variance
model_x2_2 <- tbats( series, seasonal.periods = c(7,30) ) # also high variance
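This may be the same issue as the one described here, so the cause is presumably a bug in the forecast package. I am not sure whether the following alternative gives you the results you want, but you can leave series as it is and put seasonal.periods=7 in the call to tbats: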
library(forecast)
series <- c(10, 25, 8, 27, 18, 21, 12, 9, 31, 18, 8, 30, 14, 13, 10, 14,
14, 14, 6, 9, 22, 21, 22, 8, 7, 6, 22, 21, 36, 16, 2, 13, 23,
40, 12, 27, 18, 10, 11, 37, 44, 30, 40, 25, 13, 11, 58, 56, 46,
39, 28, 27, 19, 20, 97, 90, 70, 73, 30, 22, 97, 34)
x.msts <- msts(series,seasonal.periods = 7)
model_1 <- tbats(x.msts)
model_2 <- tbats( series, seasonal.periods = 7 )
model_2 has a far more reasonable variance than model_1 (0.969 versus 4.9e+17):
> str(model_1)
List of 19
$ lambda : num 0.21
$ alpha : num 0.374
$ beta : NULL
$ damping.parameter: NULL
$ gamma.values : NULL
$ ar.coefficients : num [1:2] 1.296 -0.911
$ ma.coefficients : num [1:2] -1.62 0.98
$ likelihood : num 549
$ optim.return.code: int 0
$ variance : num 4.9e+17
$ AIC : num 571
$ parameters :List of 2
..$ vect : num [1:6] 0.21 0.374 1.296 -0.911 -1.615 ...
..$ control:List of 6
.. ..$ use.beta : logi FALSE
.. ..$ use.box.cox : logi TRUE
.. ..$ use.damping : logi FALSE
.. ..$ length.gamma: num 0
.. ..$ p : int 2
.. ..$ q : int 2
$ seed.states : num [1:5, 1] 4.16 0 0 0 0
$ fitted.values : Time-Series [1:62] from 1 to 9.71: 19.97 19.28 4.53 21.83 56.15 ...
..- attr(*, "msts")= num 7
$ errors : Time-Series [1:62] from 1 to 9.71: -1.206 0.496 0.828 0.415 -2.354 ...
..- attr(*, "msts")= num 7
$ x : num [1:5, 1:62] 3.71 -1.21 0 -1.21 0 ...
$ seasonal.periods : NULL
$ y : Time-Series [1:62] from 1 to 9.71: 10 25 8 27 18 21 12 9 31 18 ...
..- attr(*, "msts")= num 7
$ call : language tbats(y = x.msts)
- attr(*, "class")= chr "bats"
>
.
> str(model_2)
List of 23
$ lambda : num 0.198
$ alpha : num 0.198
$ beta : NULL
$ damping.parameter: NULL
$ gamma.one.values : num -0.0157
$ gamma.two.values : num 0.00991
$ ar.coefficients : NULL
$ ma.coefficients : NULL
$ likelihood : num 553
$ optim.return.code: int 0
$ variance : num 0.969
$ AIC : num 571
$ parameters :List of 2
..$ vect : num [1:4] 0.19842 0.19782 -0.0157 0.00991
..$ control:List of 6
.. ..$ use.beta : logi FALSE
.. ..$ use.box.cox : logi TRUE
.. ..$ use.damping : logi FALSE
.. ..$ length.gamma: int 2
.. ..$ p : num 0
.. ..$ q : num 0
$ seed.states : num [1:5, 1] 4.1851 0.3176 0.0103 -0.5806 0.4447
$ fitted.values : Time-Series [1:62] from 1 to 62: 25.1 20 11.1 10.2 24.3 ...
$ errors : Time-Series [1:62] from 1 to 62: -1.594 0.41 -0.507 1.697 -0.552 ...
$ x : num [1:5, 1:62] 3.87 -0.231 0.456 -0.626 -0.125 ...
$ seasonal.periods : num 7
$ k.vector : int 2
$ y : Time-Series [1:62] from 1 to 62: 10 25 8 27 18 21 12 9 31 18 ...
$ p : num 0
$ q : num 0
$ call : language tbats(y = series, seasonal.periods = 7)
- attr(*, "class")= chr [1:2] "tbats" "bats"
>
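Rather than reading through both str listings, the fields that differ can be pulled out side by side. The following is a minimal sketch, assuming the forecast package is installed; `summarise_fit` is a hypothetical helper written for this comparison, and it only reads components that appear in the str output above (`variance`, `AIC`, `seasonal.periods` and the class attribute). The numbers you get depend on the installed version of forecast, so no output is shown here.

```r
library(forecast)

series <- c(10, 25, 8, 27, 18, 21, 12, 9, 31, 18, 8, 30, 14, 13, 10, 14,
            14, 14, 6, 9, 22, 21, 22, 8, 7, 6, 22, 21, 36, 16, 2, 13, 23,
            40, 12, 27, 18, 10, 11, 37, 44, 30, 40, 25, 13, 11, 58, 56, 46,
            39, 28, 27, 19, 20, 97, 90, 70, 73, 30, 22, 97, 34)

# Same two fits as in the answer: period carried by the msts object
# versus period passed directly to tbats().
model_1 <- tbats(msts(series, seasonal.periods = 7))
model_2 <- tbats(series, seasonal.periods = 7)

# Hypothetical helper: one-row summary of a fitted bats/tbats object.
summarise_fit <- function(m) {
  sp <- m$seasonal.periods
  data.frame(
    class            = class(m)[1],
    # NULL means the fitted model has no seasonal component.
    seasonal.periods = if (is.null(sp)) NA_character_ else paste(sp, collapse = ","),
    variance         = m$variance,
    AIC              = m$AIC
  )
}

rbind(model_1 = summarise_fit(model_1), model_2 = summarise_fit(model_2))
```

In the listings above, model_1 has class "bats" with `seasonal.periods` NULL and an ARMA(2,2) error term, while model_2 has class "tbats" with `seasonal.periods` 7 and no ARMA terms. So the two calls ended up with different model structures, not just different parameter values.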