How can I reproduce the statsmodels ARIMA filter?
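I am trying to reproduce the filters used in an ARIMA model using the statsmodels recursive_filter and convolution_filter functions. (My end objective is to use these filters to prewhiten an exogenous series.)
I start by working with just an AR model and the recursive filter. Here is the simplified experimental setup: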
import numpy as np
import statsmodels.api as sm
np.random.seed(42)
# sample data: AR(2) process y_t = 0.2*y_{t-1} + 0.5*y_{t-2} + e_t
series = sm.tsa.arma_generate_sample(ar=(1, -0.2, -0.5), ma=(1,), nsample=100)
model = sm.tsa.arima.ARIMA(series, order=(2, 0, 0)).fit()
print(model.summary())
This produces the following, which seems reasonable:
                               SARIMAX Results
==============================================================================
Dep. Variable:                      y   No. Observations:                  100
Model:                 ARIMA(2, 0, 0)   Log Likelihood                -131.991
Date:                Wed, 07 Apr 2021   AIC                            271.982
Time:                        12:58:39   BIC                            282.403
Sample:                             0   HQIC                           276.200
                                - 100
Covariance Type:                  opg
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const         -0.3136      0.266     -1.179      0.238      -0.835       0.208
ar.L1          0.2135      0.084      2.550      0.011       0.049       0.378
ar.L2          0.4467      0.101      4.427      0.000       0.249       0.645
sigma2         0.8154      0.126      6.482      0.000       0.569       1.062
===================================================================================
Ljung-Box (L1) (Q):                   0.10   Jarque-Bera (JB):                 0.53
Prob(Q):                              0.75   Prob(JB):                         0.77
Heteroskedasticity (H):               0.98   Skew:                            -0.16
Prob(H) (two-sided):                  0.96   Kurtosis:                         2.85
===================================================================================
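I fit an AR(2) model and obtain the lag-1 and lag-2 coefficients from the SARIMAX results. My intuition for reproducing this model with statsmodels.tsa.filters.filtertools.recursive_filter is:
filtered = sm.tsa.filters.recursive_filter(series, ar_coeff=(-0.2135, -0.4467))
(and possibly adding in the constant from the regression results). However, a direct comparison of the results shows that the recursive filter does not replicate the AR model: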
import matplotlib.pyplot as plt
# ARIMA residuals
plt.plot(model.resid, label='ARIMA residuals')
# Calculated residuals using recursive filter outcome
plt.plot(filtered, label='recursive filter')
plt.legend()
plt.show()
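Am I going about this the wrong way? Should I be using a different filter function? My next step is to perform the same task on an MA model, so that I can add (?) the results together to obtain a complete ARMA filter for prewhitening.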
Note: this question may be valuable to somebody searching for "how can I prewhiten timeseries data?" particularly in Python using statsmodels.
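I think you should use convolution_filter for the AR part and recursive_filter for the MA part. Combining these sequentially will work for ARMA models. Alternatively, you can use arma_innovations as an exact approach that handles the AR and MA parts at the same time. Here are some examples: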
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.tsa.innovations import arma_innovations
AR(2)
np.random.seed(42)
series = sm.tsa.arma_generate_sample(ar=(1, -0.2, -0.5), ma=(1,), nsample=100)
res = sm.tsa.arima.ARIMA(series, order=(2, 0, 0), trend='n').fit()
# With trend='n', res.params is [ar.L1, ar.L2, sigma2], so [:-1] selects the AR coefficients
print(pd.DataFrame({
    'ARIMA resid': res.resid,
    'arma_innovations': arma_innovations.arma_innovations(
        series, ar_params=res.params[:-1])[0],
    'convolution filter': sm.tsa.filters.convolution_filter(
        series, np.r_[1, -res.params[:-1]], nsides=1)}))
gives:
    ARIMA resid  arma_innovations  convolution filter
0      0.496714          0.496714                 NaN
1     -0.254235         -0.254235                 NaN
2      0.666326          0.666326            0.666326
3      1.493315          1.493315            1.493315
4     -0.256708         -0.256708           -0.256708
..          ...               ...                 ...
95    -1.438670         -1.438670           -1.438670
96     0.323470          0.323470            0.323470
97     0.218243          0.218243            0.218243
98     0.012264          0.012264            0.012264
99    -0.245584         -0.245584           -0.245584
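To make the AR case concrete, here is a minimal NumPy-only sketch of what a one-sided filter with weights (1, -phi_1, ..., -phi_p) computes: e_t = y_t - phi_1*y_{t-1} - ... - phi_p*y_{t-p}. The function name ar_whiten and the simulated data are illustrative assumptions, not part of statsmodels. The first p values are left as NaN because the lagged observations do not exist yet, which would explain the two NaN rows in the table above.

```python
import numpy as np

def ar_whiten(y, ar_params):
    """Hypothetical helper: e_t = y_t - sum_i phi_i * y_{t-i} (one-sided FIR filter)."""
    y = np.asarray(y, dtype=float)
    p = len(ar_params)
    e = np.full(y.shape, np.nan)           # first p values have no complete lag window
    for t in range(p, len(y)):
        lags = y[t - p:t][::-1]            # (y_{t-1}, ..., y_{t-p})
        e[t] = y[t] - np.dot(ar_params, lags)
    return e

# Simulate y_t = 0.2*y_{t-1} + 0.5*y_{t-2} + eps_t with known innovations
rng = np.random.default_rng(0)
eps = rng.standard_normal(200)
y = np.zeros(200)
for t in range(200):
    y[t] = eps[t]
    if t >= 1:
        y[t] += 0.2 * y[t - 1]
    if t >= 2:
        y[t] += 0.5 * y[t - 2]

e = ar_whiten(y, [0.2, 0.5])               # recovers eps from index 2 onward
```

With the true coefficients, the filtered series equals the simulated innovations wherever it is defined. With estimated coefficients it gives the model residuals instead.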
MA(1)
np.random.seed(42)
series = sm.tsa.arma_generate_sample(ar=(1,), ma=(1, 0.2), nsample=100)
res = sm.tsa.arima.ARIMA(series, order=(0, 0, 1), trend='n').fit()
# With trend='n', res.params is [ma.L1, sigma2], so [:-1] selects the MA coefficient
print(pd.DataFrame({
    'ARIMA resid': res.resid,
    'arma_innovations': arma_innovations.arma_innovations(
        series, ma_params=res.params[:-1])[0],
    'recursive filter': sm.tsa.filters.recursive_filter(series, -res.params[:-1])}))
gives:
    ARIMA resid  arma_innovations  recursive filter
0      0.496714          0.496714          0.496714
1     -0.132893         -0.132893         -0.136521
2      0.646110          0.646110          0.646861
3      1.525620          1.525620          1.525466
4     -0.229316         -0.229316         -0.229286
..          ...               ...               ...
95    -1.464786         -1.464786         -1.464786
96     0.291233          0.291233          0.291233
97     0.263055          0.263055          0.263055
98     0.005637          0.005637          0.005637
99    -0.234672         -0.234672         -0.234672
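The small differences in the first few rows of the recursive-filter column are probably a start-up effect. As far as I can tell, recursive_filter without init starts from zero pre-sample values and applies the fixed coefficient theta from the first step, whereas the exact innovations use a time-varying coefficient that only converges to theta. The sketch below illustrates this for a zero-mean MA(1) in plain NumPy. Both function names are hypothetical, and the second one is a textbook innovations recursion written for illustration, not the statsmodels implementation.

```python
import numpy as np

def ma1_recursive(y, theta):
    """e_t = y_t - theta * e_{t-1}, with the pre-sample innovation assumed to be 0."""
    e = np.zeros(len(y))
    prev = 0.0
    for t, yt in enumerate(y):
        e[t] = yt - theta * prev
        prev = e[t]
    return e

def ma1_exact_innovations(y, theta):
    """Innovations recursion for a zero-mean MA(1), innovation variance normalised to 1."""
    e = np.zeros(len(y))
    v = 1.0 + theta ** 2                    # variance of the first prediction error
    e[0] = y[0]
    for t in range(1, len(y)):
        coef = theta / v                    # time-varying weight, tends to theta
        e[t] = y[t] - coef * e[t - 1]
        v = 1.0 + theta ** 2 - theta ** 2 / v
    return e

# Simulated MA(1) data: y_t = eps_t + 0.2 * eps_{t-1}
rng = np.random.default_rng(0)
eps = rng.standard_normal(101)
y = eps[1:] + 0.2 * eps[:-1]

rec = ma1_recursive(y, 0.2)
exact = ma1_exact_innovations(y, 0.2)
gap = np.abs(rec - exact)                   # zero at t=0, largest at t=1, then decaying
```

The gap shrinks roughly by a factor of theta per step, so for prewhitening a long series the two approaches are practically interchangeable after a short burn-in.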
ARMA(1, 1)
np.random.seed(42)
series = sm.tsa.arma_generate_sample(ar=(1, 0.5), ma=(1, 0.2), nsample=100)
res = sm.tsa.arima.ARIMA(series, order=(1, 0, 1), trend='n').fit()
a = res.resid
# With trend='n', res.params is [ar.L1, ma.L1, sigma2]
# Apply the recursive then convolution filter
tmp = sm.tsa.filters.recursive_filter(series, -res.params[1:2])
filtered = sm.tsa.filters.convolution_filter(tmp, np.r_[1, -res.params[:1]], nsides=1)
print(pd.DataFrame({
    'ARIMA resid': res.resid,
    'arma_innovations': arma_innovations.arma_innovations(
        series, ar_params=res.params[:1], ma_params=res.params[1:2])[0],
    'combined filters': filtered}))
gives:
    ARIMA resid  arma_innovations  combined filters
0      0.496714          0.496714               NaN
1     -0.134253         -0.134253         -0.136915
2      0.668094          0.668094          0.668246
3      1.507288          1.507288          1.507279
4     -0.193560         -0.193560         -0.193559
..          ...               ...               ...
95    -1.448784         -1.448784         -1.448784
96     0.268421          0.268421          0.268421
97     0.212966          0.212966          0.212966
98     0.046281          0.046281          0.046281
99    -0.244725         -0.244725         -0.244725
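Since the stated goal is prewhitening, the combined ARMA filter can also be written as a single recursion, e_t = x_t - sum_i phi_i*x_{t-i} - sum_j theta_j*e_{t-j}, and applied with the same coefficients to any series. The following NumPy-only sketch assumes zero pre-sample values, so it returns no NaNs but its first few outputs are approximate. The name arma_whiten is a hypothetical helper, not a statsmodels function.

```python
import numpy as np

def arma_whiten(x, ar_params=(), ma_params=()):
    """Hypothetical helper: apply the ARMA whitening filter with zero pre-sample values."""
    x = np.asarray(x, dtype=float)
    e = np.zeros_like(x)
    for t in range(len(x)):
        # AR part uses lagged inputs, MA part uses lagged outputs (innovations)
        ar_part = sum(phi * x[t - i] for i, phi in enumerate(ar_params, 1) if t - i >= 0)
        ma_part = sum(th * e[t - j] for j, th in enumerate(ma_params, 1) if t - j >= 0)
        e[t] = x[t] - ar_part - ma_part
    return e

# Simulated ARMA(1, 1): y_t = 0.5*y_{t-1} + eps_t + 0.2*eps_{t-1}, started from zero
rng = np.random.default_rng(1)
eps = rng.standard_normal(300)
phi, theta = 0.5, 0.2
y = np.zeros(300)
for t in range(300):
    y[t] = eps[t]
    if t >= 1:
        y[t] += phi * y[t - 1] + theta * eps[t - 1]

y_white = arma_whiten(y, [phi], [theta])    # recovers eps for this simulation
other = rng.standard_normal(300).cumsum()   # stand-in for a second series
other_white = arma_whiten(other, [phi], [theta])
```

In a real prewhitening workflow the coefficients would come from the fitted model (for example the AR and MA entries of res.params) rather than from known values, and the same filter would be applied to both series before computing cross-correlations.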
SARIMA(1, 0, 1)x(1, 0, 0, 3)
Seasonal models are a little more complicated, because they require multiplying lag polynomials. For additional details, see this example notebook from the Statsmodels documentation.
np.random.seed(42)
ar_poly = [1, -0.5]
sar_poly = [1, 0, 0, -0.1]
# Multiply the non-seasonal and seasonal AR lag polynomials
ar = np.polymul(ar_poly, sar_poly)
series = sm.tsa.arma_generate_sample(ar=ar, ma=(1, 0.2), nsample=100)
res = sm.tsa.arima.ARIMA(series, order=(1, 0, 1), seasonal_order=(1, 0, 0, 3), trend='n').fit()
a = res.resid
# Apply the recursive then convolution filter
tmp = sm.tsa.filters.recursive_filter(series, -res.polynomial_reduced_ma[1:])
filtered = sm.tsa.filters.convolution_filter(tmp, res.polynomial_reduced_ar, nsides=1)
print(pd.DataFrame({
    'ARIMA resid': res.resid,
    'arma_innovations': arma_innovations.arma_innovations(
        series, ar_params=-res.polynomial_reduced_ar[1:],
        ma_params=res.polynomial_reduced_ma[1:])[0],
    'combined filters': filtered}))
gives:
    ARIMA resid  arma_innovations  combined filters
0      0.496714          0.496714               NaN
1     -0.100303         -0.100303               NaN
2      0.625066          0.625066               NaN
3      1.557418          1.557418               NaN
4     -0.209256         -0.209256         -0.205201
..          ...               ...               ...
95    -1.476702         -1.476702         -1.476702
96     0.269118          0.269118          0.269118
97     0.230697          0.230697          0.230697
98    -0.004561         -0.004561         -0.004561
99    -0.233466         -0.233466         -0.233466
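As a small worked example of the polynomial multiplication, the snippet below expands (1 - 0.5L)(1 - 0.1L^3) using the same coefficients as the simulation above. The variable names are illustrative. np.polymul formally expects highest-degree-first coefficients, but the product of two coefficient sequences is the same convolution in either order, so lag-ordered lists work as well.

```python
import numpy as np

ar_poly = [1, -0.5]          # 1 - 0.5 L
sar_poly = [1, 0, 0, -0.1]   # 1 - 0.1 L^3  (seasonal period 3)

# Product: 1 - 0.5 L - 0.1 L^3 + 0.05 L^4
reduced_ar = np.polymul(ar_poly, sar_poly)

# AR parameters in the sign convention y_t = sum_i phi_i * y_{t-i} + e_t
ar_params = -reduced_ar[1:]
n_lags = len(reduced_ar) - 1  # number of lagged observations the filter needs
```

The reduced polynomial has degree 4, which is consistent with the four leading NaN values in the combined-filters column: the one-sided convolution needs four lagged values before it can produce an output.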