How to reduce the computation time for an R program

Question

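I am forecasting time series data and am struggling to reduce the computation time. Here is a code sample. The code forecasts the temperature at different monitoring stations. For 134 stations it takes about 10 minutes on my computer. Is there any way to reduce the overall computation time?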

The sample data looks like this. There are 134 stations, with 2 months of observations.

date               station1       station2       station3       station4
18/01/2017 0:00    36.8           36.25          27.4           25.75
19/01/2017 0:00    30.71428571    34.6           29.4           22.33333333
20/01/2017 0:00    38.75          40.33333333    30.16666667    29.33333333
21/01/2017 0:00    40.83333333    40.33333333    31.2           32.25

dat1 <-read.csv("smart.csv")
library(forecast)
attach(dat1)
library(forecastHybrid)
ptm <- proc.time()
result<-data.frame(auto=0,nnetar=0)
for(i in 2:135) {
   temp.ts <-ts(dat1[i])
   train = temp.ts[1:600]
   test = temp.ts[601:620]

   hm3 <- hybridModel(train, weights = "equal", errorMethod = "MASE", models = 
"an")
   accuracy(hm3,individual = TRUE)
   hForecast <- forecast(hm3, h = 1) 
   result<-rbind(result,data.frame(auto=hForecast$pointForecasts[1],
                 nnetar=hForecast$pointForecasts[2]))
   fit_accuracy <- accuracy(hForecast, test)
}

proc.time()-ptm
write.csv(result, file= "xyz.csv")

Answer 1

Given the example, I assume your data frame looks something like this:

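library(lubridate)  # provides ymd_hm()
# 620 daily timestamps, matching the 620 rows indexed in the loop
date <- seq(ymd_hm("2016-01-01 00:00"), ymd_hm("2017-09-11 00:00"), by = "day")
station1 <- runif(620)
station2 <- runif(620)
station3 <- runif(620)
station4 <- runif(620)
dat1 <- data.frame(date, station1, station2, station3, station4)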

If that is the case, your code fails with this error:

Error in testaccuracy(f, x, test, d, D) : 
  Not enough forecasts. Check that forecasts and test data match.

The error is caused by the last line of the loop:

fit_accuracy <- accuracy(hForecast, test)

because hForecast contains a single forecast (h = 1), whereas test has length 20.
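One way to resolve the mismatch is to forecast as many steps as there are test observations. The following is a minimal sketch on synthetic data; it uses auto.arima alone in place of hybridModel to keep it short, and the variable names are illustrative rather than taken from the original code.

```r
library(forecast)

set.seed(1)                   # synthetic data, for illustration only
series <- ts(runif(620))
train <- series[1:600]        # first 600 observations for fitting
test  <- series[601:620]      # last 20 observations held out

fit <- auto.arima(train)

# Forecast one step per held-out observation
fc <- forecast(fit, h = length(test))

# Forecast and test now have the same length, so accuracy() can compare them
acc <- accuracy(fc, test)
print(acc)
```

With h equal to length(test), accuracy() reports both a training-set row and a test-set row.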

So I wrote the following code, which runs fast enough:

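forecastStation <- function(data) {
  temp <- ts(data)
  # First 600 observations for training, last 20 held out for testing
  train <- temp[1:600, ]
  test <- temp[601:620, ]
  # hm3 <- hybridModel(train, weights = "equal", errorMethod = "MASE", models = "an")
  # Fit the two component models directly instead of through hybridModel
  arimaModel <- auto.arima(train)
  netModel <- nnetar(train)
  # In-sample accuracy; the values are neither stored nor printed inside the function
  accuracy(arimaModel, individual = TRUE); accuracy(netModel, individual = TRUE)
  # One-step-ahead point forecasts
  arimaPredict <- forecast(arimaModel, h = 1)$mean[1]
  netPredict <- forecast(netModel, h = 1)$mean[1]
  return(data.frame(auto = arimaPredict, nnetar = netPredict))
}
# Columns 2:5 are the four sample stations; use 2:135 for the full data set
result <- do.call("rbind", lapply(2:5, function(x) forecastStation(dat1[x])))
result$Station <- colnames(dat1)[2:5]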

The main difference from your code is that I don't use the hybridModel function; instead I call auto.arima and nnetar separately.
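To see where the time goes for a single station, each model can be timed on its own. This is a rough sketch on synthetic data, not a benchmark: `train` stands in for one station's training series, and the timings will vary by machine and by data.

```r
library(forecast)

set.seed(1)
train <- ts(runif(600))  # synthetic stand-in for one station's training data

# system.time() reports user, system and elapsed seconds for the expression;
# the assignment inside it still creates the model object in this environment
arimaTime <- system.time(arimaModel <- auto.arima(train))
netTime   <- system.time(netModel <- nnetar(train))

timings <- c(auto.arima = arimaTime[["elapsed"]],
             nnetar     = netTime[["elapsed"]])
print(timings)
```

Multiplying the per-station figures by the number of stations gives a rough estimate of the total run time.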

The result is a data frame of the following form:

> result
       auto    nnetar  Station
1 0.4995727 0.4906344 station1
2 0.4907216 0.5045967 station2
3 0.5300489 0.5413126 station3
4 0.5021821 0.4951382 station4

These are one-step-ahead forecasts. I'm not sure whether you want to forecast 1 step ahead or 20. If it is the latter, change the function to:

forecastStation <- function(data) {
  temp <- ts(data)
  train <- temp[1:600, ]
  test <- temp[601:620, ]
  # hm3 <- hybridModel(train, weights = "equal", errorMethod = "MASE", models = "an")
  arimaModel <- auto.arima(train)
  netModel <- nnetar(train)
  accuracy(arimaModel, individual = TRUE); accuracy(netModel, individual = TRUE)
  # 20-step-ahead point forecasts, one per held-out observation
  arimaPredict <- forecast(arimaModel, h = 20)$mean[1:20]
  netPredict <- forecast(netModel, h = 20)$mean[1:20]
  return(data.frame(auto = arimaPredict, nnetar = netPredict))
}
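With 20 forecasts per station, the combined data frame has 20 rows for each station, so a single vector of column names no longer lines up with the rows. One possible way to keep track is to attach the station name and step number inside the function. The sketch below runs on synthetic data; `forecastStationLabeled`, `step` and the two-station `dat1` are hypothetical names introduced for illustration.

```r
library(forecast)

set.seed(1)  # synthetic stand-in for the real station data
dat1 <- data.frame(date = 1:620,
                   station1 = runif(620),
                   station2 = runif(620))

# Hypothetical helper: forecasts h steps for one named station column
forecastStationLabeled <- function(name, h = 20) {
  train <- ts(dat1[[name]][1:600])
  data.frame(
    Station = name,                 # recycled to h rows
    step    = seq_len(h),           # forecast horizon, 1..h
    auto    = as.numeric(forecast(auto.arima(train), h = h)$mean),
    nnetar  = as.numeric(forecast(nnetar(train), h = h)$mean)
  )
}

stations <- colnames(dat1)[-1]      # every column except the date
result <- do.call("rbind", lapply(stations, forecastStationLabeled))
head(result)
```

Each row then carries its own station and step, so the result can be filtered or reshaped without relying on row order.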

Hope this helps.
