Statsmodels: How can I select different confidence intervals for regressions?
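I want to run a regression with statsmodels using a 99% confidence interval instead of the default 95%.
I looked through the documentation for a suitable parameter in the fit() method, but I did not notice anything. I also tried the conf_int method, but I am confused by the output.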
import pandas as pd
import math
import statsmodels.formula.api as sm
df = pd.read_excel(r'C:\TestData.xlsx')
df['LogBalance'] = df['Balance'].map(lambda x: math.log(x))
est = sm.ols(formula= 'LogBalance ~ N + Rate',
data=df).fit(cov_type='HAC',cov_kwds={'maxlags':1})
print(est.summary())
print(est.conf_int(alpha=0.01, cols=None))
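Since I am new to Python, could you tell me whether, and how, I can run a regression in statsmodels so that the initial regression output already shows the adjusted confidence interval?
Thanks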
You can specify the confidence level directly in .summary() by passing alpha. Consider the following example:
import statsmodels.formula.api as smf
import seaborn as sns
# load a sample dataset
df = sns.load_dataset('tips')
# run model
formula = 'tip ~ size + total_bill'
results = smf.ols(formula=formula, data=df).fit()
# use 95 % CI (default setting)
print(results.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:                    tip   R-squared:                       0.468
Model:                            OLS   Adj. R-squared:                  0.463
Method:                 Least Squares   F-statistic:                     105.9
Date:                Fri, 21 Jun 2019   Prob (F-statistic):           9.67e-34
Time:                        21:42:09   Log-Likelihood:                -347.99
No. Observations:                 244   AIC:                             702.0
Df Residuals:                     241   BIC:                             712.5
Df Model:                           2
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.6689      0.194      3.455      0.001       0.288       1.050
size           0.1926      0.085      2.258      0.025       0.025       0.361
total_bill     0.0927      0.009     10.172      0.000       0.075       0.111
==============================================================================
Omnibus:                       24.753   Durbin-Watson:                   2.100
Prob(Omnibus):                  0.000   Jarque-Bera (JB):               46.169
Skew:                           0.545   Prob(JB):                     9.43e-11
Kurtosis:                       4.831   Cond. No.                         67.6
==============================================================================
# use 99 % CI
print(results.summary(alpha=0.01))
                            OLS Regression Results
==============================================================================
Dep. Variable:                    tip   R-squared:                       0.468
Model:                            OLS   Adj. R-squared:                  0.463
Method:                 Least Squares   F-statistic:                     105.9
Date:                Fri, 21 Jun 2019   Prob (F-statistic):           9.67e-34
Time:                        21:45:57   Log-Likelihood:                -347.99
No. Observations:                 244   AIC:                             702.0
Df Residuals:                     241   BIC:                             712.5
Df Model:                           2
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.005      0.995]
------------------------------------------------------------------------------
Intercept      0.6689      0.194      3.455      0.001       0.166       1.172
size           0.1926      0.085      2.258      0.025      -0.029       0.414
total_bill     0.0927      0.009     10.172      0.000       0.069       0.116
==============================================================================
Omnibus:                       24.753   Durbin-Watson:                   2.100
Prob(Omnibus):                  0.000   Jarque-Bera (JB):               46.169
Skew:                           0.545   Prob(JB):                     9.43e-11
Kurtosis:                       4.831   Cond. No.                         67.6
==============================================================================
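On the conf_int output that the question found confusing: as far as I can tell, conf_int(alpha=0.01) returns the same bounds that summary(alpha=0.01) prints, as a table with one row per coefficient and two unnamed columns (0 = lower bound, 1 = upper bound). Below is a minimal sketch of that, reusing the HAC settings from the question. The data is synthetic and generated only for illustration (it is not the asker's TestData.xlsx), and the column names lower_99 and upper_99 are my own labels, not something statsmodels produces.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data (an assumption for illustration only).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({"N": rng.normal(size=n), "Rate": rng.normal(size=n)})
df["LogBalance"] = (
    1.0 + 0.5 * df["N"] - 0.3 * df["Rate"] + rng.normal(scale=0.5, size=n)
)

# Same model specification and HAC covariance options as in the question.
est = smf.ols("LogBalance ~ N + Rate", data=df).fit(
    cov_type="HAC", cov_kwds={"maxlags": 1}
)

# The default alpha=0.05 gives 95% intervals; alpha=0.01 gives 99% intervals.
ci95 = est.conf_int()
ci99 = est.conf_int(alpha=0.01)

# The two columns come back labelled 0 and 1; rename them for readability.
ci99.columns = ["lower_99", "upper_99"]
print(ci99)

# The summary table with alpha=0.01 shows the same bounds under [0.005 0.995].
summary_text = est.summary(alpha=0.01).as_text()
print(summary_text)
```

Note that the fit itself does not depend on alpha. The confidence level only affects how the interval is reported, which is presumably why fit() has no such parameter.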