How to make an itertools ARIMA grid search run faster
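I have an ARIMA grid-search function built with itertools that I believe gives me the best ARIMA model for my time-series data of 400 datasets. It does produce results, but it has been running for 72 hours, which is far too slow. How can I get it to run in a matter of minutes?
I tried d=0, p=q=range(0,3) instead of p = d = q = range(0, 3), which gave me an error.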
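The question does not show the error, but a likely cause is that itertools.product needs iterables, and a bare `d = 0` is an int (which raises `TypeError: 'int' object is not iterable`). A minimal sketch of fixing d at 0 while keeping the rest of the grid, assuming that no differencing is acceptable for the data, which is a modelling choice and not something the question confirms:

```python
import itertools

p = q = range(0, 3)
d = [0]  # a one-element list keeps d iterable; a bare `d = 0` is not

# Same construction as in the question, now with a single value for d
pdq = list(itertools.product(p, d, q))
seasonal_pdq = [(P, D, Q, 12) for P, D, Q in itertools.product(p, d, q)]

# Number of orders, seasonal orders, and total model fits per series
print(len(pdq), len(seasonal_pdq), len(pdq) * len(seasonal_pdq))
```

This shrinks the search from 27 x 27 = 729 combinations per series to 9 x 9 = 81.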
# Generate the `ts` data
import numpy as np
import pandas as pd
from datetime import datetime
date_rng = pd.date_range('1985-01', periods=400, freq='M')
ts = pd.DataFrame(date_rng, columns=['date'])
ts['data'] = np.random.randint(0,100,size=(len(date_rng)))
ts.head(5)
import warnings
import itertools
warnings.filterwarnings("ignore") # specify to ignore warning messages
# Define the p, d and q parameters to take any value between 0 and 2
p = d = q = range(0, 3)
# Generate all different combinations of p, d and q triplets
pdq = list(itertools.product(p, d, q))
# Generate all different combinations of seasonal P, D and Q triplets (period 12)
seasonal_pdq = [(x[0], x[1], x[2], 12) for x in itertools.product(p, d, q)]
print('Examples of parameter combinations for Seasonal ARIMA...')
print('SARIMAX: {} x {}'.format(pdq[1], seasonal_pdq[1]))
print('SARIMAX: {} x {}'.format(pdq[1], seasonal_pdq[2]))
print('SARIMAX: {} x {}'.format(pdq[2], seasonal_pdq[3]))
print('SARIMAX: {} x {}'.format(pdq[2], seasonal_pdq[4]))
import statsmodels.api as sm

# SARIMAX needs a single series, so use the 'data' column indexed by date
y = ts.set_index('date')['data']
for param in pdq:
    for param_seasonal in seasonal_pdq:
        try:
            mod = sm.tsa.statespace.SARIMAX(y,
                                            order=param,
                                            seasonal_order=param_seasonal,
                                            enforce_stationarity=False,
                                            enforce_invertibility=False)
            results = mod.fit(disp=False)  # disp=False silences optimizer output
            print('ARIMA{}x{}12 - AIC:{}'.format(param, param_seasonal, results.aic))
        except Exception:
            # Skip parameter combinations that fail to fit
            continue
I want the Python code to run very, very fast, within a few minutes.
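To see why the loop is slow, it helps to count the fits. This is a back-of-the-envelope sketch that assumes the "400 datasets" are 400 separate series, each searched over the full grid, and that the 72 hours covers the whole run; neither is confirmed in the question.

```python
# Rough cost of the full grid under the assumptions stated above
n_series = 400
fits_per_series = (3 ** 3) * (3 ** 3)      # 27 orders x 27 seasonal orders
total_fits = n_series * fits_per_series    # total SARIMAX fits
seconds_per_fit = 72 * 3600 / total_fits   # average time per fit

print(total_fits, round(seconds_per_fit, 2))
```

Under those assumptions each fit already takes under a second on average, so the runtime is dominated by the number of combinations rather than by any single slow model.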
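It's me again. I tried to find a faster approach, and multiprocessing might be the solution. I tried it today but ran into a few small problems; I will post it so that someone can help us.
By the way, while searching I found another, faster method, so if you are interested you can use this code: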
from pmdarima.arima import auto_arima
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
# read_csv's squeeze=True was removed in pandas 2.0; squeeze the single column instead
series = pd.read_csv('dataset.csv', header=None, index_col=0, parse_dates=True).squeeze('columns')
train, test = series[1:900], series[900:]
Arima_model = auto_arima(train, start_p=1, start_q=1, max_p=8, max_q=8,
                         start_P=0, start_Q=0, max_P=8, max_Q=8,
                         m=12, seasonal=True, d=1, D=1,
                         trace=True, error_action='warn', suppress_warnings=True,
                         random_state=20, n_fits=30)
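The multiprocessing idea mentioned above could be sketched roughly as follows. This is not the code I tried, only a minimal sketch using the standard library's concurrent.futures: `make_tasks`, `fit_aic` and `parallel_grid_search` are hypothetical helper names, d and D are assumed fixed at 0 by default, and the series is passed as plain values so that it can be sent to the worker processes.

```python
import itertools
import math
from concurrent.futures import ProcessPoolExecutor


def make_tasks(p=range(3), d=(0,), q=range(3), m=12):
    """Return every (order, seasonal_order) pair in the grid."""
    orders = list(itertools.product(p, d, q))
    seasonal = [(P, D, Q, m) for P, D, Q in itertools.product(p, d, q)]
    return list(itertools.product(orders, seasonal))


def fit_aic(job):
    """Fit one SARIMAX model and return (aic, order, seasonal_order)."""
    values, order, seasonal_order = job
    # Imported here so each worker process loads statsmodels itself
    from statsmodels.tsa.statespace.sarimax import SARIMAX
    try:
        res = SARIMAX(values, order=order, seasonal_order=seasonal_order,
                      enforce_stationarity=False,
                      enforce_invertibility=False).fit(disp=False)
        aic = res.aic if math.isfinite(res.aic) else float("inf")
    except Exception:
        # Treat a failed fit as infinitely bad instead of aborting the search
        aic = float("inf")
    return aic, order, seasonal_order


def parallel_grid_search(values, tasks, fit=fit_aic,
                         executor_cls=ProcessPoolExecutor, workers=None):
    """Score every task in parallel and return the lowest-AIC result."""
    jobs = [(values, order, seasonal) for order, seasonal in tasks]
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(fit, jobs))
    return min(results, key=lambda r: r[0])
```

Usage would look like `best = parallel_grid_search(train.to_numpy(), make_tasks())`. On Windows and macOS the call has to sit under `if __name__ == "__main__":`, because worker processes re-import the main module. The speedup is bounded by the number of CPU cores, so this helps most in combination with a smaller grid.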