Increase speed of batch forecast with purrr
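I was reading a blog post about batch forecasting and wanted to speed it up. I tried using purrr, but that cut the time by less than half. Below is a reproducible example showing the example from Hyndman's blog post alongside a purrr alternative. How can I reduce this time further?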
library(forecast)
library(tidyverse)
library(purrr)
#read file
retail <- read.csv("https://robjhyndman.com/data/ausretail.csv",header=FALSE)
# hyndmans loop
retail <- ts(retail[,-1],f=12,s=1982+3/12)
ns <- ncol(retail)
h <- 24
fcast <- matrix(NA,nrow=h,ncol=ns)
system.time(
for(i in 1:ns)
fcast[,i] <- forecast(retail[,i],h=h)$mean
)
# user system elapsed
# 60.14 0.17 61.72
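# purrr try
# NOTE: frequency = 7 here, while the loop above treats the data as monthly
# (frequency = 12), so the two timings are not directly comparable.
system.time(
retail_forecast <- retail %>%
as_tibble() %>%
map(~ts(.,frequency = 7)) %>%
map_dfc(~forecast(.,h=h)$mean))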
# user system elapsed
# 32.23 0.03 35.32
You can use the furrr package to parallelize purrr functions. Here is an excerpt from the package page:
The goal of furrr is to simplify the combination of purrr’s family of mapping functions and future’s parallel processing capabilities. A new set of future_map_*() functions have been defined, and can be used as (hopefully) drop in replacements for the corresponding map_*() function.
The code draws heavily from the implementations of purrr and future.apply
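To illustrate the "drop in replacement" idea from the excerpt, here is a minimal sketch. The function `slow_square()` is a hypothetical stand-in for an expensive model fit, and the worker count of 2 is an arbitrary assumption; neither comes from the package page.

```r
library(purrr)
library(furrr)

# Run futures in two background R sessions (worker count is an assumption).
plan(multisession, workers = 2)

# Hypothetical stand-in for an expensive per-series computation.
slow_square <- function(x) {
  Sys.sleep(0.1)
  x^2
}

# Sequential purrr call and its furrr counterpart: only the function name changes.
res_seq <- map_dbl(1:8, slow_square)
res_par <- future_map_dbl(1:8, slow_square)

# Both calls should return the same values.
all.equal(res_seq, res_par)

# Return to sequential processing and shut down the background workers.
plan(sequential)
```

Any speedup depends on the per-element cost outweighing the overhead of starting the workers and shipping data to them.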
Using furrr, I was able to cut the computation time by more than a factor of 3 on my Linux machine.
library(forecast)
library(tidyverse)
### read file
retail <- read.csv("https://robjhyndman.com/data/ausretail.csv", header = FALSE)
# Hyndman's loop
retail <- ts(retail[, -1], f = 12, s = 1982 + 3 / 12)
ns <- ncol(retail)
h <- 24
fcast <- matrix(NA, nrow = h, ncol = ns)
system.time(
for (i in 1:ns)
fcast[, i] <- forecast(retail[, i], h = h)$mean
)
# user system elapsed
# 50.592 0.016 50.599
#
# purrr try
system.time(
retail_forecast <- retail %>%
as_tibble() %>%
map(~ts(., frequency = 12)) %>%
map_dfc(~ forecast(., h = h)$mean)
)
# user system elapsed
# 50.232 0.000 50.224
#
# furrr try
library(furrr)
#> Loading required package: future
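# You set a "plan" for how the code should run. The timing below was recorded
# with `plan(multiprocess)`, which picked multicore on Linux/macOS and
# multisession on Windows. `multiprocess` has since been deprecated in the
# future package, so `multisession` is used here instead.
plan(multisession)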
system.time(
retail_forecast <- retail %>%
as_tibble() %>%
future_map(~ts(., frequency = 12)) %>%
future_map_dfc(~ forecast(., h = h)$mean)
)
# user system elapsed
# 0.172 0.080 14.702
#
Created on 2018-08-01 by the reprex package (v0.2.0.9000).
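If you want the result in the same h-by-ns matrix shape as `fcast` in the loop, one possible variation is to map over column indices instead of converting to a tibble. This keeps each column's `ts` attributes, so there is no need to re-wrap with `ts(., frequency = 12)`. The sketch below uses a small simulated monthly data set (`sim`, an assumption made so the example does not need the download) and an arbitrary worker count of 2.

```r
library(forecast)
library(furrr)

# Two background R sessions (worker count is an assumption for this sketch).
plan(multisession, workers = 2)

set.seed(1)
# Simulated stand-in for the retail data: 4 monthly series of 120 observations.
sim <- ts(matrix(100 + cumsum(rnorm(480)), ncol = 4),
          frequency = 12, start = c(2000, 1))
h <- 24

# Map over column indices; sim[, i] is still a ts object with frequency 12.
fc_list <- future_map(
  seq_len(ncol(sim)),
  function(i) forecast::forecast(sim[, i], h = h)$mean
)

# Collect the point forecasts into an h x ns matrix, like `fcast` in the loop.
fcast_par <- vapply(fc_list, as.numeric, numeric(h))
dim(fcast_par)

# Return to sequential processing and shut down the background workers.
plan(sequential)
```

With the real data, `sim` would be replaced by the `retail` object created above.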