Univariate regression in Python
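I need to run multiple single-factor (univariate) regression models in Python between one column of a data frame and several other columns of the same data frame.
So, based on the image, I want to run regression models between x1 & dep, x2 & dep, and so on.
Desired output: beta, intercept, R-sq, p-value, SSE, AIC, BIC, a normality test of the residuals, etc.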
There are two options you can use here. One is the popular scikit-learn library. It is used as follows:
from sklearn.linear_model import LinearRegression
reg = LinearRegression()
reg.fit(X, y)  # where X is your feature data and y is your target
reg.score(X, y)  # R^2 value
>>> 0.87
reg.coef_  # slope coefficients
>>> array([1.45, -9.2])
reg.intercept_  # intercept
>>> 6.1723...
scikit-learn does not expose many other statistics beyond these.
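If scikit-learn is still preferred, a quantity it does not report, such as SSE, can be computed by hand from the residuals. The following is only a minimal sketch: the data frame `df` and its values are synthetic stand-ins, and the column names `x1`, `x2`, and `dep` are borrowed from the question.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the asker's data frame (hypothetical values)
rng = np.random.default_rng(0)
df = pd.DataFrame({"x1": rng.normal(size=50), "x2": rng.normal(size=50)})
df["dep"] = 2.0 * df["x1"] + rng.normal(scale=0.5, size=50)

rows = []
for col in ["x1", "x2"]:
    X = df[[col]]  # double brackets keep X two-dimensional, as scikit-learn expects
    y = df["dep"]
    reg = LinearRegression().fit(X, y)
    resid = y - reg.predict(X)  # residuals of this one-predictor model
    rows.append({
        "predictor": col,
        "beta": reg.coef_[0],
        "intercept": reg.intercept_,
        "r_squared": reg.score(X, y),
        "sse": float(np.sum(resid ** 2)),  # sum of squared errors, computed manually
    })

sklearn_table = pd.DataFrame(rows).set_index("predictor")
print(sklearn_table)
```

P-values, AIC, BIC, and residual normality tests would each need further manual work along these lines.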
The other option is statsmodels, which provides much richer detail about the model statistics:
import numpy as np
import statsmodels.api as sm
# generate some synthetic data
nsample = 100
x = np.linspace(0, 10, 100)
X = np.column_stack((x, x**2))
beta = np.array([1, 0.1, 10])
e = np.random.normal(size=nsample)
X = sm.add_constant(X)
y = np.dot(X, beta) + e
# fit the model and get a summary of the statistics
model = sm.OLS(y, X)
results = model.fit()
print(results.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       1.000
Model:                            OLS   Adj. R-squared:                  1.000
Method:                 Least Squares   F-statistic:                 4.020e+06
Date:                Mon, 08 Jul 2019   Prob (F-statistic):          2.83e-239
Time:                        02:07:22   Log-Likelihood:                -146.51
No. Observations:                 100   AIC:                             299.0
Df Residuals:                      97   BIC:                             306.8
Df Model:                           2                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          1.3423      0.313      4.292      0.000       0.722       1.963
x1            -0.0402      0.145     -0.278      0.781      -0.327       0.247
x2            10.0103      0.014    715.745      0.000       9.982      10.038
==============================================================================
Omnibus:                        2.042   Durbin-Watson:                   2.274
Prob(Omnibus):                  0.360   Jarque-Bera (JB):                1.875
Skew:                           0.234   Prob(JB):                        0.392
Kurtosis:                       2.519   Cond. No.                         144.
==============================================================================
As you can see, statsmodels provides much more detail, such as AIC, BIC, t-statistics, and so on.