Why is statsmodels' ARIMA(1,0,0) not equivalent to AutoReg(1)?

Question

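I am comparing the results from arima_model and ar_model. Here is what I can't understand:

Why are the resulting coefficients different? Is it because of the estimation method? (Different settings of the method argument of fit() don't give identical results.)
After getting the coefficients and backtesting the fitted results, I match those of AR(1) but not those of ARIMA(1,0,0). Why?
What is ARIMA really doing in this simplest setting? Isn't it supposed to be able to reproduce AR?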

    import pandas_datareader as pdr
    import datetime 
    aapl = pdr.get_data_yahoo('AAPL', start=datetime.datetime(2006,1,1), end=datetime.datetime(2020,6,30))
    
    aapl = aapl.resample('M').mean()
    aapl['close_pct_change'] = aapl['Close'].pct_change()
    
    from statsmodels.tsa.arima_model import ARIMA
    mod = ARIMA(aapl['close_pct_change'][1:], order=(1,0,0))
    res1 = mod.fit(method='mle')
    print(res1.summary())
    
    from statsmodels.tsa.ar_model import AutoReg, ar_select_order
    mod = AutoReg(aapl['close_pct_change'][1:], 1)
    res2 = mod.fit()
    print(res2.summary())
    
    fitted_check1 = res1.params[0] + res1.params[1]*aapl['close_pct_change'][1:].shift(1)
    print(fitted_check1[1:] - res1.fittedvalues)
    
    fitted_check2 = res2.params[0] + res2.params[1]*aapl['close_pct_change'][1:].shift(1)
    print(fitted_check2[1:] - res2.fittedvalues)

Answer 1

Why are the resulting coefficients different? Is it because of the estimation method? (Different settings of the method property of fit() don't give identical results)

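AutoReg estimates the parameters by OLS, which is maximum likelihood conditional on the first observation. ARIMA implements full (exact) maximum likelihood, so it also uses the information available in the first observation when estimating the parameters. In very large samples the coefficients should be very close, and they are equal in the asymptotic limit. In practice they will always differ, although the difference is usually small.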
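To make the distinction concrete, here is a minimal pure-Python sketch of the two estimators for a Gaussian AR(1). It is not how statsmodels implements either model: the helper names `simulate_ar1`, `conditional_ols`, and `exact_mle` are hypothetical, the data are simulated, and a coarse grid search stands in for a proper numerical optimizer. The exact likelihood adds a term for the first observation, which is assumed to come from the stationary distribution.

```python
import math
import random


def simulate_ar1(n, mean, b, sigma, seed=0):
    """Simulate a stationary AR(1): (y_t - mean) = b*(y_{t-1} - mean) + eps_t."""
    rng = random.Random(seed)
    # First observation is drawn from the stationary distribution.
    y = [mean + rng.gauss(0.0, sigma / math.sqrt(1.0 - b * b))]
    for _ in range(n - 1):
        y.append(mean + b * (y[-1] - mean) + rng.gauss(0.0, sigma))
    return y


def conditional_ols(y):
    """OLS of y_t on (1, y_{t-1}); returns intercept a and slope b."""
    x, z = y[:-1], y[1:]
    mx, mz = sum(x) / len(x), sum(z) / len(z)
    sxz = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxz / sxx
    return mz - b * mx, b


def exact_mle(y):
    """Grid search over b for the exact Gaussian log-likelihood.

    For each b, the mean c and the innovation variance are concentrated
    out in closed form. Returns the mean c and the slope b.
    """
    n = len(y)
    pairs = list(zip(y[:-1], y[1:]))
    best = None
    for i in range(1, 2000):
        b = -1.0 + i / 1000.0  # b in (-1, 1), step 0.001
        w = 1.0 - b * b        # weight of the first observation
        # Closed-form c minimising the weighted sum of squares for this b.
        num = w * y[0] + (1.0 - b) * sum(yt - b * yl for yl, yt in pairs)
        den = w + (n - 1) * (1.0 - b) ** 2
        c = num / den
        ssq = w * (y[0] - c) ** 2 + sum(
            (yt - c - b * (yl - c)) ** 2 for yl, yt in pairs
        )
        # Concentrated log-likelihood, up to an additive constant.
        loglik = -0.5 * n * math.log(ssq / n) + 0.5 * math.log(w)
        if best is None or loglik > best[0]:
            best = (loglik, c, b)
    return best[1], best[2]


y = simulate_ar1(n=200, mean=1.0, b=0.5, sigma=1.0)
a_ols, b_ols = conditional_ols(y)
c_mle, b_mle = exact_mle(y)
print("conditional OLS: a =", a_ols, "b =", b_ols)
print("exact MLE:       c =", c_mle, "b =", b_mle)
```

The two slope estimates should be close but not identical, and the gap should shrink as `n` grows, which is the behaviour described above.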

After getting the coefficients and backtesting the fitted results I match those of the AR(1) but not of ARIMA(1). Why?

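The two models use different parameterizations. The AutoReg(1) model is Y(t) = a + b * Y(t-1) + eps(t). ARIMA(1,0,0) is specified as (Y(t) - c) = b * (Y(t-1) - c) + eps(t), so its constant c is the mean of the process rather than the regression intercept. If |b| < 1, then in the large-sample limit c = a / (1 - b), although this identity will not hold exactly in finite samples.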
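A small sketch of what this means for the backtest, assuming the ARIMA constant is the mean c as described above. Expanding the mean form gives Y(t) = c * (1 - b) + b * Y(t-1) + eps(t), so the implied intercept is a = c * (1 - b). The function names and numbers below are hypothetical and chosen only for illustration.

```python
def ar_intercept_from_mean(c, b):
    """Intercept a of Y(t) = a + b*Y(t-1) + eps(t) implied by the mean form."""
    return c * (1.0 - b)


def one_step_fitted(y, c, b):
    """One-step predictions for t >= 1: c + b*(Y(t-1) - c) = a + b*Y(t-1)."""
    a = ar_intercept_from_mean(c, b)
    return [a + b * y_prev for y_prev in y[:-1]]


# Hypothetical parameter values and data, not taken from the question.
c, b = 0.02, 0.3
y = [0.05, -0.01, 0.03, 0.00]

# Treating c as if it were the intercept a, as a check of the form
# params[0] + params[1] * y.shift(1) would do.
naive = [c + b * y_prev for y_prev in y[:-1]]
correct = one_step_fitted(y, c, b)

# The naive reconstruction is off by the constant c - a = c*b at every step.
offsets = [n - k for n, k in zip(naive, correct)]
print(offsets)
```

Under this reading, a check like `fitted_check1` in the question would need `c * (1 - b)` in place of the raw constant. The fitted value for the very first observation may still differ, since there is no lag to regress on.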

What is ARIMA really doing in this simplest setting, isn't it supposed to be able to reproduce AR?

No. ARIMA uses the statsmodels Statespace framework, which can estimate a wide range of models using Gaussian MLE.

ARIMA is essentially a special case of SARIMAX, and this notebook provides a good introduction.

python

statistics

time-series

statsmodels