How to create one linear regression for two groups of data
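I have placed two scatter plots on one figure. I want to find a single linear regression line through the combined points of y1 and y2 (that is, a regression of the combined y1 and y2 values on x), but I am stuck because I only know how to find the regression line for y1 or y2 separately. I would also like the r^2 value for the combined y1 and y2. Any help would be appreciated!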
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df1 = pd.DataFrame(np.random.randint(0, 100, size=(15, 2)), columns=list('AB'))
y1 = df1['A']
y2 = df1['B']
plt.scatter(df1.index, y1)
plt.scatter(df1.index, y2)
plt.show()
It sounds like you want to stack columns A and B together. There are many ways to do this; here is one using stack:
df2 = df1.stack().rename('A_and_B').reset_index(level = 1, drop = True).to_frame()
Then df2.head() looks like this:
   A_and_B
0       35
0       58
1       49
1       73
2       44
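One of the other ways to combine the columns is DataFrame.melt, which also keeps a label recording which column each value came from. This is a minimal sketch, assuming the same df1 layout as in the question; the seed and the names x, group and long_df are illustrative choices, not part of the original answer:

```python
import numpy as np
import pandas as pd

# Same shape as the question's data: 15 rows, two columns A and B.
rng = np.random.default_rng(0)  # fixed seed only so the sketch is reproducible
df1 = pd.DataFrame(rng.integers(0, 100, size=(15, 2)), columns=list("AB"))

# Turn the index into a regular column named 'x' (hypothetical name),
# then melt A and B into one value column with a 'group' label.
long_df = (
    df1.rename_axis("x")
       .reset_index()
       .melt(id_vars="x", var_name="group", value_name="A_and_B")
)

print(long_df.head())
```

Unlike stack, melt lists all of the A rows first and then all of the B rows, but the set of (x, value) pairs is the same, so a regression on long_df['x'] and long_df['A_and_B'] sees the same points.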
And the scatter plot:
plt.scatter(df2.index, df2['A_and_B'])
[Figure: scatter plot of A_and_B against the index]
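I don't know how you are running your regression, but you can now apply your method to df2. For example, with statsmodels (note that sm.OLS fits no intercept unless a constant column is added to the regressors):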
import statsmodels.api as sm
res = sm.OLS(df2['A_and_B'], df2.index).fit()
res.summary()
Output:
                                 OLS Regression Results
=======================================================================================
Dep. Variable:                 A_and_B   R-squared (uncentered):                  0.517
Model:                             OLS   Adj. R-squared (uncentered):             0.501
Method:                  Least Squares   F-statistic:                             31.10
Date:                 Mon, 14 Mar 2022   Prob (F-statistic):                   5.11e-06
Time:                         23:02:47   Log-Likelihood:                        -152.15
No. Observations:                   30   AIC:                                     306.3
Df Residuals:                       29   BIC:                                     307.7
Df Model:                            1
Covariance Type:             nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x1             4.8576      0.871      5.577      0.000       3.076       6.639
==============================================================================
Omnibus:                        3.466   Durbin-Watson:                   1.244
Prob(Omnibus):                  0.177   Jarque-Bera (JB):                1.962
Skew:                          -0.371   Prob(JB):                        0.375
Kurtosis:                       1.990   Cond. No.                         1.00
==============================================================================
Notes:
[1] R² is computed without centering (uncentered) since the model does not contain a constant.
[2] Standard Errors assume that the covariance matrix of the errors is correctly specified.
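The uncentered R² in the summary comes from a fit that is forced through the origin. If what is wanted is a regression line with an intercept and the usual (centered) r^2, one option is a degree-1 np.polyfit with r^2 computed by hand. This is only a sketch that swaps in NumPy for statsmodels; the seed and variable names are assumptions for illustration, and because the data are random the numbers will not match the summary shown here:

```python
import numpy as np
import pandas as pd

# Rebuild data with the same layout as in the question (seed is illustrative).
rng = np.random.default_rng(0)
df1 = pd.DataFrame(rng.integers(0, 100, size=(15, 2)), columns=list("AB"))
df2 = df1.stack().rename("A_and_B").reset_index(level=1, drop=True).to_frame()

x = df2.index.to_numpy(dtype=float)       # each index value appears twice
y = df2["A_and_B"].to_numpy(dtype=float)  # combined A and B values

# Least-squares line y ≈ slope * x + intercept through all 30 points.
slope, intercept = np.polyfit(x, y, 1)

# Centered R^2 = 1 - SS_res / SS_tot.
y_hat = slope * x + intercept
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(f"slope={slope:.4f}, intercept={intercept:.4f}, R^2={r_squared:.4f}")
```

With statsmodels, the comparable change would be to wrap the regressor in sm.add_constant before passing it to sm.OLS, so that the model contains a constant and the summary reports the centered R-squared.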