发展预测 function/loop

Question

我是 R 的初学者，感谢任何帮助或提示来开发 function/loop 以自动执行以下预测过程：这是一个虚拟数据集

> class(stack_help) 
[1] "data.frame" 
> stack_help
    OO    GG      CC      DD
1   198.12 60.56   265.5  271.24
2   145.68 52.28   328.9  427.68
3   106.48 47.08  380.24  695.60
4    83.16 43.52  443.94  934.30
5    89.72 46.68   484.6 1084.26
6    86.48 35.46  415.56  924.68
7    93.68 24.40  376.42  798.14
8   101.70 22.68  260.42  427.72
9   115.88 22.00  228.26  245.72
10  137.24 21.60   212.7  140.64
11  129.82 18.78  230.02   46.04
12  145.00 17.62  220.74   47.16
13  135.38 18.84  245.52  143.28
14  146.38 20.68  322.18  490.20
15  154.08 19.48  374.16  621.48
16  149.34 22.68  484.28  999.50
17  152.74 28.90  533.26 1223.58
18  148.62 27.44  456.76  974.44
19  158.54 23.90  417.52  820.54
20  169.96 27.08  306.16  498.02
21  152.50 33.74   283.1  309.22
22  149.68 38.44  224.54  123.82
23  149.48 38.94  215.28   30.26
24  153.38 36.24  193.18   75.46
25  155.58 37.88  243.34  228.92
26  165.84 37.00  318.08  528.58
27  171.34 38.96   393.6  707.04
28  183.60 48.20  531.62 1169.40
29  192.58 44.46  507.96 1037.22
30  207.92 43.52     435  956.96
31  228.88 47.44  399.58  788.78
32  246.14 45.74  262.84  397.66
33  228.92 45.98   240.8  255.32
34  227.52 45.22  211.44   96.02
35  232.92 43.02  203.08   62.18
36  220.16 43.88  188.56   63.74
37  221.76 46.78  210.58  131.28
38  218.94 45.10  272.36  438.64
39  221.00 47.48  351.58  689.90
40  215.82 44.68  402.82  854.80
41  222.32 43.74  435.06 1013.92
42  239.40 52.26  474.24 1128.04
43  249.86 47.62  324.92  689.40
44  240.92 49.60  289.82  538.98
45  221.04 48.40  218.74  256.80
46  191.18 47.34  192.36  136.84
47  206.28 48.66  188.22   60.60
48  226.68 48.12  174.54   58.36
49  226.76 51.66  204.26  190.58
50  223.94 53.40  272.22  454.56
51  219.42 54.50  339.26  647.94
52  219.36 54.68 #VALUE! 1040.08
53  225.94 53.06  462.82 1066.12
54  233.04 52.64  425.32  916.22
55  218.48 64.22  438.06  961.36
56  205.76 56.44  292.24  534.28
57  206.06 53.42  225.32  272.24
58  206.22 52.50   190.2  117.16
59  215.44 52.14  182.12   32.56
60  221.92 51.10  175.82   47.50

感谢任何关于改进以下过程的建议，并热衷于使用应用函数或循环函数来自动化它。

OO 列是我要使用创建预测模型的变量。
其他列是预测变量，我想测试预测是否与它们一起使用或仅与 OO 的过去数据一起使用效果更好。
我进行了 36 次观察以拟合具有 "forecast" 包中函数 auto.arima 的 Arima 模型。
该函数提供了一些模型参数 p,d,q, , 比方说 0,1,0

现在我想以自动化方式测试模型并执行以下操作：

一个。预测下一个时期，上面的数据 table 相当于第 37 行。

b。将预测结果与历史数据进行比较，第 37 行，第 OO 列。

c。从包 "forecast" 调用 accuracy 函数并与数据点行 37 进行比较。 PLus ，将错误度量存储在向量中。

d.更新 'xdata' 参数，添加历史点 37 以及 'xreg' 参数，为预测变量再增加一个月，并为下一个时期调用另一个预测并重做此过程，直到我完成 24 个预测的测试。

虽然我用包 "forecast" 安装模型，但我发现使用包 astsa 中的函数 'sarima.for' 更容易。

在代码之前，还有更多信息：

Train.OO 将是
n.ahead = 预测范围的参数：在本例中为 1 个周期
0,1,0 将是 ARIMA 模型 (p,d,q)
Train.GG 将是预测变量，即 GG 列的前 36 个观测值
newxreg 只是从数据 table 中截取 TS 对象中的一个数据点，这将成为预测的预测变量。

现在代码

fc.1 <- sarima.for( 
xdata = Train.00,    
n.ahead = 1, 0, 1, 0 , 
xreg = Train.GG, 
newxreg = window(ts(slack_help$GG, start = c(2009,1), 
frequency = 12), start = c(2012,1) , end = c(2012,11)))
fc.1                      
fc.1.acc <- accuracy(fc.1$pred, 
                 window(ts((slack_help$OO), start = c(2009,1),frequency = 
12),    start = c(2012,1), end = c(2012,1), frequency =12)

现在是第二个命令：

fc.2 <- sarima.for( 
xdata = window(ts((slack_help$OO), start = c(2009,1),frequency = 12), 
start = c(2009,1), end = c(2012,1), frequency =12),  
n.ahead = 1, 0, 1, 0 , 
xreg = window(ts((slack_help$GG), start = c(2009,1),frequency = 12), 
start = c(2009,1), end = c(2012,1), frequency =12),
newxreg = window(ts((slack_help$GG), start = c(2009,1),frequency = 12), 
start = c(2009,2), end = c(2012,2), frequency =12),

fc.2
fc.2.acc <- accuracy(fc.2$pred, 
                 window(ts((slack_help$OO), start = c(2009,1),frequency = 
12),  start = c(2012,2), end = c(2012,2), frequency =12)

fc.2.acc

我为以下预测做了这个。基本上相同的代码，只是更新了 window 函数的日期，以削减要在预测中考虑的正确时间序列。

总共调用了 24 个。

我知道这是低效的 "brute force"。但是，我对如何开始开发 function/loop 有点迷茫。感谢有关如何自动执行上述步骤的任何评论或提示。提前致谢！

Answer 1

迭代每月时间序列的最简单方法之一是使用 1/12 是一个月这一事实。例如，如果数据从 2009 年 1 月开始，那么我们可以将其等同于 2009.000。这可以在将数据设为 ts 对象 (stack_help <- ts(stack_help, start=c(2009,1), freq=12)) 后使用 timeProp <-tsp(stack_help)[1] 提取。那么2009年2月是timeProp + 1/12 = 2009.083，2009年3月是timeProp+2/12 = 2009.167。 2011 年 1 月是 timeProp+24/12 = 2011.000 等。让我们应用这个：

library(forecast)

#first define the ts-object so we don't have to repeat it every time we use it
stack_help <- ts(stack_help, start=c(2009,1), freq=12)

#extract the start date
timeProp <- tsp(stack_help)[1]

#set vector for storing accuracy measures
accuracyMeasure <- 0
#set counter for above vector
k <- 1

#star the loop with the minimum length of data you want to use to estimate model.
#In this case start with a length of 36 (Jan 2009 - Des 2011)
for(i in 36:(nrow(stack_help)-1)){

  #create new train and test set for each iteration. This way makes your code
  #clearer and more transparent and easier to maintain in the future
  train <- window(stack_help, end=timeProp+((i-1)/12))
  test <- window(stack_help, start=timeProp+i/12, end=(timeProp+i/12))

  #Estimate the model. This can be changed to e.g. auto.arima(). Haven't tried
  #with sarima.for, but should be straight forward to use that as well.
  arrdarr <- Arima(train[,1], order=c(0,1,0), xreg=train[,2])

  #Forecast h=1 with the new xreg (h=1 is automatic since nrow(test)=1)
  foreArima <- forecast(arrdarr, xreg=test[,2])

  #Extract the test MAPE accuracy. 2 selects the test accuracy. "MAPE" can be changed to extract
  #others, e.g. #ME", "MASE". Remember that with h=1, you can scratch the "mean" part of the measures.
  accuracyMeasure[k] <- accuracy(foreArima, test[,1])[2,"MAPE"]

  k <- k+1
}

发展预测 function/loop

Developing forecasting function/loop

loops

r

function

forecasting

arima