Time series forecasting with daily values
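I am using auto.arima on univariate data to produce a forecast, but the forecast is wrong. I believe I have followed all the steps correctly, yet the point forecast is not right. Please help.
Here is my data: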
s <- read.csv(url('https://ondemand.websol.barchart.com/getHistory.csv?apikey=c3122f072488a29c5279680b9a2cf88e&symbol=zs*1&type=dailyNearest&backAdjust=false&startDate=20100201'))
Here is my code:
data <- s[c(3, 7)]
summary(data)
data1.ts <- zoo(data[,2], seq(from = as.Date("2010-02-01"), to = as.Date("2022-05-13"), by = 1))
autoplot(data1.ts)
ARIMA model:
fit_arima <- auto.arima(data1.ts, stepwise = FALSE, approximation = FALSE, trace = TRUE)
print(summary(fit_arima))
checkresiduals(fit_arima)
forecast_Arima <- forecast(fit_arima, h = 1)
forecast_Arima
Forecast values:
      Point Forecast   Lo 80    Hi 80   Lo 95    Hi 95
19126       976.4357 949.874 1002.997 935.813 1017.058
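Small update:
I tried loading the data as a ts object and got an accurate point forecast, but the year of the forecast is wrong. The one-step-ahead forecast is labelled with a value in 2021, while my data end on 13 May 2022. I only want to correct the year. Here is the new code: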
ts_soy <- ts(data[,2], start = c(2010-02-01), frequency = 214)
autoplot(ts_soy)
fit_arima <- auto.arima(ts_soy)
print(summary(fit_arima))
checkresiduals(fit_arima)
forecast_Arima <- forecast(fit_arima, h = 1)
forecast_Arima
         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
2021.472         1646.5 1625.071 1667.929 1613.727 1679.273
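I can reproduce your problem. The cause is that the date index of your data1.ts is too long. You are trying to get rid of the weekends to create a continuous time series (that is, a time series without gaps). The principle is correct, but the index overshoots the number of records by 1388. Because R tends to recycle values, you get the closing prices from the early years again at the end of the series, and this affects the arima function.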
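The mismatch can be checked with a little date arithmetic. The sketch below uses only base R and assumes the download holds 3097 records (a figure inferred from the `+ 3096` offset used in this answer, not read from the file). `rep_len()` stands in for the recycling, to show what happens when the values are shorter than the index.

```r
# Calendar days in the index that the question builds (weekends included).
idx <- seq(from = as.Date("2010-02-01"), to = as.Date("2022-05-13"), by = 1)
n_index <- length(idx)

# Assumed number of trading-day records in the download.
n_records <- 3097

# Index positions that have no record of their own.
overshoot <- n_index - n_records
print(c(index = n_index, records = n_records, overshoot = overshoot))

# Toy illustration of recycling: 3 values stretched over a 5-day index.
toy_values <- c(10, 20, 30)
toy_index <- as.Date("2010-02-01") + 0:4
recycled <- rep_len(toy_values, length(toy_index))
print(data.frame(date = toy_index, value = recycled))
```

Under that assumption the overshoot comes out at 1388, and the toy series ends with its first two values repeated. The same pattern in the real series would put early-2010 prices at the end, right before the forecast.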
You can do something like this: create the time series from the earliest date up to that date + number of records - 1.
data.ts <- zoo(data[,2], seq(from = as.Date("2010-02-01"),
                             to = as.Date("2010-02-01") + 3096,  # number of records - 1
                             by = 1))
fit_ar <- forecast::auto.arima(data.ts, stepwise = FALSE, approximation = FALSE)
forecast::forecast(fit_ar, h = 1)
      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
17738       1648.129 1626.759 1669.499 1615.446 1680.812
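On the `ts` update in the question: the year in the forecast label follows directly from the `start` and `frequency` arguments. In R, `c(2010-02-01)` is plain arithmetic and evaluates to 2007, and `frequency = 214` tells `ts` that 214 observations make up one year. A minimal sketch, again assuming 3097 records and using placeholder zeros instead of the real prices:

```r
# c(2010-02-01) is a subtraction, not a date.
start_value <- c(2010-02-01)
print(start_value)

# Placeholder series with the assumed number of records.
n_records <- 3097
placeholder <- ts(rep(0, n_records), start = start_value, frequency = 214)

# Time label that a one-step-ahead forecast would receive:
# the end of the series plus one period.
next_time <- tsp(placeholder)[2] + 1 / 214
print(round(next_time, 3))
```

Under these assumptions the label is 2021.472, the value shown in the question. It is therefore an artifact of the chosen `start` and `frequency` and not a calendar date, so it should not be read as "mid-2021".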
This is also one of the reasons I prefer to use fable: it does a better job of checking the data.
library(fpp3)
library(fable.prophet)
fit <- data %>%
  mutate(id = row_number()) %>%  # create an index to use, otherwise the time series has gaps
  tsibble(index = id) %>%
  model(naive = NAIVE(close),
        arima = ARIMA(close, stepwise = FALSE, approximation = FALSE))
forecast(fit, h = 1)
# A fable: 2 x 4 [1]
# Key: .model [2]
  .model    id        close .mean
  <chr>  <dbl>       <dist> <dbl>
1 naive   3098 N(1646, 280) 1646.
2 arima   3098 N(1649, 278) 1649.
# prophet needs dates and can handle weekends
prophet_fit <- data %>%
  mutate(tradingDay = ymd(tradingDay)) %>%
  tsibble() %>%
  model(prophet_model = prophet(close))
forecast(prophet_fit, h = 1)
# A fable: 1 x 4 [1D]
# Key: .model [1]
  .model        tradingDay        close .mean
  <chr>         <date>           <dist> <dbl>
1 prophet_model 2022-05-14 sample[5000] 1657.
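To compare the fable output with the `Lo 80` / `Hi 80` columns from `forecast()`, the distribution can be turned into an interval by hand. This sketch assumes the second number in `N(1649, 278)` is a variance (which is how fable prints normal distributions, as far as I know). It uses the rounded values from the output above, so the result is only approximate.

```r
# Rounded values taken from the fable output for the arima model.
mean_forecast <- 1649
variance <- 278

# 80% interval: mean +/- z * standard deviation.
z80 <- qnorm(0.9)
half_width <- z80 * sqrt(variance)
interval_80 <- c(lo = mean_forecast - half_width, hi = mean_forecast + half_width)
print(round(interval_80, 1))
```

With these inputs the interval lands within roughly one point of the 1626.759 to 1669.499 range reported by `forecast::forecast()`, which suggests the two approaches agree once the index is fixed. In practice `hilo()` on the fable should return the interval without the manual step.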