python 中的单变量回归

Question

需要运行多个单因素（单变量）回归模型 python 在数据框中的一列和同一数据框中的其他几个列之间

-

所以根据图像，我想运行 x1 & dep、x2 & dep 等等之间的回归模型

想要输出 - beta、截距、R-sq、p 值、SSE、AIC、BIC、残差的正态性检验等

Answer 1

您可以在此处使用两个选项。一个是流行的 scikit-learn 库。用法如下

from sklearn.linear_model import LinearRegression
model = LinearRegression()
model.fit(X, y)  # where X is your feature data and y is your target
reg.score(X, y)  # R^2 value
>>> 0.87
reg.coef_  # slope coeficients
>>> array([1.45, -9.2])
reg.intercept_  # intercept
>>> 6.1723...

没有多少其他统计信息可以与 scikit 一起使用。

另一个选项是 statsmodels，它提供了更丰富的模型统计信息细节

import numpy as np
import statsmodels.api as sm

# generate some synthetic data
nsample = 100
x = np.linspace(0, 10, 100)
X = np.column_stack((x, x**2))
beta = np.array([1, 0.1, 10])
e = np.random.normal(size=nsample)

X = sm.add_constant(X)
y = np.dot(X, beta) + e

# fit the model and get a summary of the statistics
model = sm.OLS(y, X)
results = model.fit()
print(results.summary())

                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       1.000
Model:                            OLS   Adj. R-squared:                  1.000
Method:                 Least Squares   F-statistic:                 4.020e+06
Date:                Mon, 08 Jul 2019   Prob (F-statistic):          2.83e-239
Time:                        02:07:22   Log-Likelihood:                -146.51
No. Observations:                 100   AIC:                             299.0
Df Residuals:                      97   BIC:                             306.8
Df Model:                           2                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          1.3423      0.313      4.292      0.000       0.722       1.963
x1            -0.0402      0.145     -0.278      0.781      -0.327       0.247
x2            10.0103      0.014    715.745      0.000       9.982      10.038
==============================================================================
Omnibus:                        2.042   Durbin-Watson:                   2.274
Prob(Omnibus):                  0.360   Jarque-Bera (JB):                1.875
Skew:                           0.234   Prob(JB):                        0.392
Kurtosis:                       2.519   Cond. No.                         144.
==============================================================================

您可以看到 statsmodels 提供了更多详细信息，例如 AIC、BIC、t-statistics 等

python 中的单变量回归

univariate regression in python

python-3.x

pandas

jupyter-lab