Increase speed of batch forecast with purrr

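I am reading a blog post about batch forecasting and want to speed it up. I tried using purrr, but it cut the time by less than half. Below is a reproducible example showing the loop from Hyndman's blog post alongside a purrr alternative. How can I reduce the time further?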

library(forecast)
library(tidyverse)
library(purrr)
#read file
retail <- read.csv("https://robjhyndman.com/data/ausretail.csv",header=FALSE)

# Hyndman's loop
retail <- ts(retail[,-1],frequency=12,start=1982+3/12)
ns <- ncol(retail)
h <- 24
fcast <- matrix(NA,nrow=h,ncol=ns)
system.time(
for(i in 1:ns)
  fcast[,i] <- forecast(retail[,i],h=h)$mean
)

#   user  system elapsed 
#  60.14    0.17   61.72 

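# purrr attempt
# note: frequency = 7 below differs from the frequency = 12 used for the loop
# above, so the two timings are not fitting the same seasonal setup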
system.time(
retail_forecast <- retail %>% 
  as_tibble() %>% 
  map(~ts(.,frequency = 7)) %>% 
  map_dfc(~forecast(.,h=h)$mean))

#   user  system elapsed 
#  32.23    0.03   35.32 

You can parallelize purrr functions with the furrr package. Here is an excerpt from the package page:

The goal of furrr is to simplify the combination of purrr’s family of mapping functions and future’s parallel processing capabilities. A new set of future_map_*() functions have been defined, and can be used as (hopefully) drop in replacements for the corresponding map_*() function.

The code draws heavily from the implementations of purrr and future.apply
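To make the "drop in replacement" idea concrete, here is a minimal sketch on a toy input, independent of the retail data. The `square` helper and the choice of two workers are assumptions made purely for illustration; the point is only that the `map_*()` call and the `future_map_*()` call have the same shape and should return the same values.

```r
library(purrr)
library(furrr)

# Assumption: two background R sessions; adjust to your machine
plan(multisession, workers = 2)

# Hypothetical toy function standing in for an expensive computation
square <- function(x) x^2

res_seq <- map_dbl(1:8, square)         # sequential purrr version
res_par <- future_map_dbl(1:8, square)  # parallel furrr version, same arguments

# Both calls should produce the same numeric vector
identical(res_seq, res_par)

# Return to sequential processing when done
plan(sequential)
```

On a toy input like this the parallel version is likely slower because of the overhead of starting workers; the gain only shows up when each element takes noticeable time, as with `forecast()`.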

Using furrr, I was able to cut the computation time by more than a factor of 3 on my Linux machine.

library(forecast)
library(tidyverse)

### read file
retail <- read.csv("https://robjhyndman.com/data/ausretail.csv", header = FALSE)

### Hyndman's loop

retail <- ts(retail[, -1], f = 12, s = 1982 + 3 / 12)
ns <- ncol(retail)
h <- 24
fcast <- matrix(NA, nrow = h, ncol = ns)
system.time(
  for (i in 1:ns)
    fcast[, i] <- forecast(retail[, i], h = h)$mean
)

# user  system elapsed 
# 50.592   0.016  50.599
#

### purrr attempt

system.time(
  retail_forecast <- retail %>%
    as_tibble() %>%
    map(~ts(., frequency = 12)) %>%
    map_dfc(~ forecast(., h = h)$mean)
)

# user  system elapsed 
# 50.232   0.000  50.224 
#

### furrr attempt

library(furrr)
#> Loading required package: future
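# You set a "plan" for how the code should run.
# The original run used plan(multiprocess), which picked plan(multicore) where
# forking is supported (e.g. Mac) and plan(multisession) on Windows.
# multiprocess has since been deprecated in the future package, so
# multisession is used here; the timing below was recorded with the original
# plan(multiprocess).
plan(multisession)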

system.time(
  retail_forecast <- retail %>%
    as_tibble() %>%
    future_map(~ts(., frequency = 12)) %>%
    future_map_dfc(~ forecast(., h = h)$mean)
)

# user  system elapsed 
# 0.172   0.080  14.702
#

Created on 2018-08-01 by the reprex package (v0.2.0.9000).
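If you want the result in the same h-by-ns matrix shape as `fcast` in the loop version, something along these lines might work. This is only a sketch under stated assumptions: the simulated `sim` series stand in for the retail data so the example runs without a download, the name `fcast_par` is made up here, and leaving one core free is just a common convention, not a requirement.

```r
library(forecast)
library(furrr)

# Assumption: use all but one core; availableCores() comes with the future package
plan(multisession, workers = max(1, availableCores() - 1))

# Hypothetical stand-in data: 4 simulated monthly series of 10 years each
set.seed(1)
sim <- ts(matrix(100 + rnorm(480), ncol = 4), frequency = 12, start = c(2000, 1))

h <- 12
ns <- ncol(sim)

# One forecast per column, run in parallel; each element is a numeric vector of length h
fc_list <- future_map(seq_len(ns), function(i) as.numeric(forecast(sim[, i], h = h)$mean))

# Bind the list column-wise into an h x ns matrix, like `fcast` in the loop
fcast_par <- do.call(cbind, fc_list)
dim(fcast_par)

# Return to sequential processing when done
plan(sequential)
```

Mapping over column indices rather than over `as_tibble()` keeps each column a `ts` object with its original frequency, so there is no need to rebuild the series inside the pipeline.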