在 python 中估计模型的多个参数
Estimating multiple parameters of a model in python
想知道在模型中用 Python 估计这些参数 (a0, a1, a2, a3)
的最 efficient/accurate 方法是什么:
col_4 = a0 + a1*col_1 + a2*col_2 + a3*col_3
示例数据集为:
inputs = {
'col_1': np.random.normal(15,2,100),
'col_2': np.random.normal(15,1,100),
'col_3': np.random.normal(0.9,1,100),
'col_4': np.random.normal(-0.05,0.5,100),
}
_idx = pd.date_range('2021-01-01','2021-04-10',freq='D').to_series()
data = pd.DataFrame(inputs, index = _idx)
statsmodels
提供了一种非常简单的方法来估计这样的线性模型:
import statsmodels.formula.api as smf
results = smf.ols('col_4 ~ col_1 + col_2 + col_3', data=data).fit()
print(results.summary())
coef
列显示您的 aX
参数:
OLS Regression Results
==============================================================================
Dep. Variable: col_4 R-squared: 0.049
Model: OLS Adj. R-squared: 0.019
Method: Least Squares F-statistic: 1.637
Date: Wed, 29 Dec 2021 Prob (F-statistic): 0.186
Time: 17:25:00 Log-Likelihood: -68.490
No. Observations: 100 AIC: 145.0
Df Residuals: 96 BIC: 155.4
Df Model: 3
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
Intercept 0.2191 0.846 0.259 0.796 -1.461 1.899
col_1 -0.0198 0.023 -0.854 0.395 -0.066 0.026
col_2 -0.0048 0.051 -0.093 0.926 -0.107 0.097
col_3 0.1155 0.056 2.066 0.042 0.005 0.226
==============================================================================
Omnibus: 2.292 Durbin-Watson: 2.291
Prob(Omnibus): 0.318 Jarque-Bera (JB): 2.296
Skew: -0.351 Prob(JB): 0.317
Kurtosis: 2.757 Cond. No. 370.
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
默认情况下包括拦截 (a0
)。如果要去掉,只需要在公式
中加一个-1
想知道在模型中用 Python 估计这些参数 (a0, a1, a2, a3)
的最 efficient/accurate 方法是什么:
col_4 = a0 + a1*col_1 + a2*col_2 + a3*col_3
示例数据集为:
inputs = {
'col_1': np.random.normal(15,2,100),
'col_2': np.random.normal(15,1,100),
'col_3': np.random.normal(0.9,1,100),
'col_4': np.random.normal(-0.05,0.5,100),
}
_idx = pd.date_range('2021-01-01','2021-04-10',freq='D').to_series()
data = pd.DataFrame(inputs, index = _idx)
statsmodels
提供了一种非常简单的方法来估计这样的线性模型:
import statsmodels.formula.api as smf
results = smf.ols('col_4 ~ col_1 + col_2 + col_3', data=data).fit()
print(results.summary())
coef
列显示您的 aX
参数:
OLS Regression Results
==============================================================================
Dep. Variable: col_4 R-squared: 0.049
Model: OLS Adj. R-squared: 0.019
Method: Least Squares F-statistic: 1.637
Date: Wed, 29 Dec 2021 Prob (F-statistic): 0.186
Time: 17:25:00 Log-Likelihood: -68.490
No. Observations: 100 AIC: 145.0
Df Residuals: 96 BIC: 155.4
Df Model: 3
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
Intercept 0.2191 0.846 0.259 0.796 -1.461 1.899
col_1 -0.0198 0.023 -0.854 0.395 -0.066 0.026
col_2 -0.0048 0.051 -0.093 0.926 -0.107 0.097
col_3 0.1155 0.056 2.066 0.042 0.005 0.226
==============================================================================
Omnibus: 2.292 Durbin-Watson: 2.291
Prob(Omnibus): 0.318 Jarque-Bera (JB): 2.296
Skew: -0.351 Prob(JB): 0.317
Kurtosis: 2.757 Cond. No. 370.
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
默认情况下包括拦截 (a0
)。如果要去掉,只需要在公式
-1