What test (null hypothesis) does a model's `f_pvalue` correspond to?
What is the null hypothesis behind the `f_pvalue` attribute of `OLSResults`? The docstring is not particularly helpful.
At first I thought the null hypothesis was that all estimated coefficients are simultaneously zero (including the constant). However, I have come to believe the hypothesis being tested is that all estimated parameters except the constant are simultaneously zero (i.e. b1 = b2 = ... = bp = 0, excluding b0).
For example, suppose `y` is the target array and `X` is a numpy matrix of features (a constant plus p features).
# Silly example
from statsmodels.api import OLS
m = OLS(endog=y, exog=X).fit()
# What is being tested here?
print(m.f_pvalue)
Does anyone know what the null hypothesis is?
Thanks to @Josef for clearing this up. According to the documentation:
F-statistic of the fully specified model.
Calculated as the mean squared error of the model divided by the mean squared error of the residuals if the nonrobust covariance is used. Otherwise computed using a Wald-like quadratic form that tests whether all coefficients (excluding the constant) are zero.
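As a quick sanity check of that nonrobust formula, here is a minimal sketch (not part of the original example; it uses synthetic data, and `scipy.stats.f` only to turn the F statistic back into a p-value):
# Sanity check: F = mse_model / mse_resid reproduces fvalue and f_pvalue (nonrobust case)
import numpy as np
from scipy import stats
from statsmodels.api import OLS, add_constant
rng = np.random.default_rng(0)
X = add_constant(rng.normal(size=(200, 2)))   # constant + 2 features
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=200)
m = OLS(endog=y, exog=X).fit()
F = m.mse_model / m.mse_resid                 # explained mean square over residual mean square
p = stats.f.sf(F, m.df_model, m.df_resid)     # upper tail of the F(df_model, df_resid) distribution
print(np.isclose(F, m.fvalue), np.isclose(p, m.f_pvalue))  # True True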
And just to show that this is the case, i.e. that the constant is excluded from the joint test:
# Libraries
import numpy as np
import pandas as pd
from statsmodels.api import OLS
from sklearn.datasets import load_boston
# Load target
y = pd.DataFrame(load_boston()['target'], columns=['price'])
# Load features
X = pd.DataFrame(load_boston()['data'], columns=load_boston()['feature_names'])
# Add constant
X['CONST'] = 1
# One feature
m1 = OLS(endog=y, exog=X[['CONST','CRIM']]).fit()
print(f'm1 pvalue: {m1.f_pvalue}')
# Multiple features
m2 = OLS(endog=y, exog=X[['CONST','CRIM','AGE']]).fit()
print(f'm2 pvalue: {m2.f_pvalue}')
# Manually test H0: all coefficients are zero (excluding b0)
print('Manual F-test for m1', m1.f_test(r_matrix=np.matrix([[0,0],[0,1]])),
'Manual F-test for m2', m2.f_test(r_matrix=np.matrix([[0,0,0],[0,1,0],[0,0,1]])),
sep='\n')
# Output
"""
> m1 pvalue: 1.1739870821944483e-19
> m2 pvalue: 2.2015246345918656e-27
> Manual F-test for m1
> <F test: F=array([[89.48611476]]), p=1.1739870821945733e-19, df_denom=504, df_num=1>
> Manual F-test for m2
> <F test: F=array([[69.51929476]]), p=2.2015246345920063e-27, df_denom=503, df_num=2>
"""
So yes, `f_pvalue` matches the p-value obtained by specifying the null hypothesis manually.
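As a footnote (not in the original post): the same joint null can be passed to `f_test` as a string of constraints on the named parameters, which avoids building the restriction matrices by hand. A sketch, assuming the fitted `m1` and `m2` from above:
# Same tests, written as string constraints on the parameter (column) names
print(m1.f_test('CRIM = 0'))
print(m2.f_test('CRIM = 0, AGE = 0'))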