How to invert differencing in a Python statsmodels ARIMA forecast?
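I am trying to get to grips with ARIMA forecasting using Python and Statsmodels. Specifically, for the ARIMA algorithm to work, the data needs to be made stationary via differencing (or a similar method). The question is: after making the residual forecast, how do I invert the differencing to get back to a forecast that includes the trend and seasonality that were differenced out?
(I saw a similar question here but, alas, no answers have been posted.)
Here is what I have done so far (based on the example in the last chapter of Mastering Python Data Analysis, Magnus Vilhelm Persson; Luiz Felipe Martins). The data comes from DataMarket.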
%matplotlib inline
from datetime import datetime

import matplotlib.pyplot as plt
import pandas as pd
from statsmodels import tsa
from statsmodels.tsa import stattools as stt
from statsmodels.tsa.seasonal import seasonal_decompose
# Legacy ARIMA class; this module was removed in statsmodels 0.13
from statsmodels.tsa.arima_model import ARIMA


def is_stationary(df, maxlag=15, autolag=None, regression='ct'):
    """Test if df is stationary using the Augmented Dickey-Fuller test."""
    adf_test = stt.adfuller(df, maxlag=maxlag, autolag=autolag, regression=regression)
    adf = adf_test[0]            # test statistic
    cv_5 = adf_test[4]["5%"]     # 5% critical value
    return adf < cv_5


def d_param(df, max_lag=12):
    """Return the first lag i for which df.diff(i) tests as stationary (0 if none)."""
    d = 0
    for i in range(1, max_lag):
        # df.diff(i) takes the lag-i difference of the series
        if is_stationary(df.diff(i).dropna()):
            d = i
            break
    return d


def ARMA_params(df):
    """Pick the (p, q) order that minimises AIC."""
    p, q = tsa.stattools.arma_order_select_ic(df.dropna(), ic='aic').aic_min_order
    return p, q


# read data
carsales = pd.read_csv('data/monthly-car-sales-in-quebec-1960.csv',
                       parse_dates=['Month'],
                       index_col='Month',
                       date_parser=lambda d: datetime.strptime(d, '%Y-%m'))
carsales = carsales.iloc[:, 0]

# get components (newer statsmodels versions name this argument period)
carsales_decomp = seasonal_decompose(carsales, freq=12)
residuals = carsales - carsales_decomp.seasonal - carsales_decomp.trend
residuals = residuals.dropna()

# fit model
d = d_param(carsales, max_lag=12)
p, q = ARMA_params(residuals)
model = ARIMA(residuals, order=(p, d, q))
model_fit = model.fit()

# plot prediction
model_fit.plot_predict(start='1961-12-01', end='1970-01-01', alpha=0.10)
plt.legend(loc='upper left')
plt.xlabel('Year')
plt.ylabel('Sales')
plt.title('Residuals 1960-1970')
print(model_fit.aic, model_fit.bic)
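The resulting plot is satisfactory, but it does not include the trend and seasonality information. How do I invert the differencing to recapture the trend/seasonality?
(Figure: residual plot)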
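Because the model above is fitted to the residuals of an additive decomposition, its forecasts are on the residual scale. One way to map them back is to add a trend and a seasonal component back in. The following is only a minimal sketch under stated assumptions, not the book's or statsmodels' method: `recompose_additive` and all of its parameter names are hypothetical, the trend is extrapolated linearly from its last two valid values (a crude choice), and the seasonal pattern is assumed to repeat unchanged every `period` steps.

```python
def recompose_additive(resid_forecast, trend_valid, gap, seasonal_last_cycle):
    """Add trend and seasonality back to residual-scale forecasts.

    resid_forecast      -- forecasts of the residual for steps T+1, T+2, ...
    trend_valid         -- non-missing trend values (a centred moving average
                           leaves missing values at both ends of the series)
    gap                 -- number of steps between the last valid trend value
                           and the last observation T
    seasonal_last_cycle -- seasonal values for the final full cycle, ending at T
    """
    period = len(seasonal_last_cycle)
    # Assumption: the trend keeps the slope of its last two valid values.
    slope = trend_valid[-1] - trend_valid[-2]
    forecasts = []
    for h, resid in enumerate(resid_forecast, start=1):
        trend_h = trend_valid[-1] + slope * (gap + h)
        # The seasonal value at T+h equals the one a full cycle earlier.
        seasonal_h = seasonal_last_cycle[(h - 1) % period]
        forecasts.append(trend_h + seasonal_h + resid)
    return forecasts


# Toy numbers, not the car sales data.
level_forecast = recompose_additive(
    resid_forecast=[0.5, 0.0, -0.5],
    trend_valid=[10.0, 12.0],
    gap=1,
    seasonal_last_cycle=[1.0, -1.0],
)
print(level_forecast)
```

With `seasonal_decompose` and a period of 12, the last six trend values are missing, so `gap` would be 6 in that setting; a better trend extrapolation (or a model that handles trend and seasonality itself) is likely preferable in practice.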
You are relying on differencing when a time trend (or several) might be a better strategy. Period 33 is an outlier, and ignoring it has consequences.
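To illustrate the point about a time trend and an outlier, here is a minimal sketch on synthetic data (not the car sales series). It fits a straight-line trend by ordinary least squares, optionally with a pulse dummy for one observation. The function name `fit_linear_trend` is hypothetical, and treating "period 33" as zero-based index 32 is an assumption. The sketch uses the fact that a dummy variable for a single observation gives the same trend coefficients as leaving that observation out of the fit.

```python
def fit_linear_trend(y, outlier_index=None):
    """OLS fit of y[t] = intercept + slope * t, with an optional pulse dummy.

    Returns (intercept, slope, pulse), where pulse is the estimated size of
    the outlier (None when no outlier index is given).
    """
    # A dummy for one observation is equivalent to excluding that point.
    points = [(t, v) for t, v in enumerate(y) if t != outlier_index]
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(v for _, v in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (v - mean_y) for t, v in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    pulse = None
    if outlier_index is not None:
        # The dummy coefficient is the gap between the point and the trend.
        pulse = y[outlier_index] - (intercept + slope * outlier_index)
    return intercept, slope, pulse


# Synthetic series: a linear trend with a spike at period 33 (index 32).
series = [100 + 2 * t for t in range(48)]
series[32] += 50

naive = fit_linear_trend(series)                     # outlier ignored
adjusted = fit_linear_trend(series, outlier_index=32)
print("slope ignoring the outlier:", naive[1])
print("slope with a pulse dummy:  ", adjusted[1])
print("estimated outlier size:    ", adjusted[2])
```

On this toy series the unadjusted slope is pulled away from the true value of 2, while the adjusted fit recovers it; how much this matters for the real data depends on the size of the outlier.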
The PACF does not show a strong seasonal component.
There is a weak seasonal AR term, with strong correlation with March, April, May and June.
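On the literal question of inverting a difference: a lag-k difference is undone by adding each forecast difference to the level k steps earlier, starting from the last k observed values. Below is a minimal pure-Python sketch; the helper names `difference` and `invert_difference` are hypothetical. For lag 1 this is just the last observed value plus the cumulative sum of the forecast differences. Note also that when `d > 0` is passed in the ARIMA order, the model differences the data internally; in the legacy `arima_model` API, `predict(typ='levels')` should, as far as I know, return predictions on the original scale, so a manual inversion may not be needed.

```python
def difference(series, lag=1):
    """Lag-`lag` difference: series[t] - series[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def invert_difference(diff_forecast, history, lag=1):
    """Turn forecasts of lag-`lag` differences back into levels.

    history must hold at least the last `lag` observed levels.
    """
    levels = list(history[-lag:])
    for step in diff_forecast:
        # Level at T+h = level at T+h-lag + forecast difference at T+h.
        levels.append(levels[-lag] + step)
    return levels[lag:]


# Round trip on toy numbers: hold out the last two points, then rebuild them.
observed = [10, 12, 15, 19, 24, 30]
history, held_out = observed[:4], observed[4:]
future_diffs = difference(observed, lag=1)[-2:]   # stands in for a forecast
rebuilt = invert_difference(future_diffs, history, lag=1)
print(rebuilt == held_out)
```

The same function handles seasonal differencing by passing `lag=12` for monthly data, as long as `history` contains at least the last 12 observed values.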