How to make a trend line go through the origin while plotting its R2 value - python
I am working with a dataframe df that looks like this:
index      var1     var2     var3
0           0.0      0.0      0.0
10      43940.7   2218.3   6581.7
100    429215.0  16844.3  51682.7
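For each variable, I want to plot the data, draw a trend line forced through the origin, and compute and display the R2 value.
I found code that does most of what I want, but its trend line does not go through the origin, and I can't find a way to make it do so.
I tried manually overwriting the first point of the trend line, but the result doesn't look right.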
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import r2_score  # assumed source of r2_score

for var in df.columns[1:]:
    fig, ax = plt.subplots(figsize=(10,7))
    x = df.index
    y = df[var]
    # First attempt: ordinary degree-1 fit (slope and intercept)
    z = np.polyfit(x, y, 1)
    p = np.poly1d(z)
    plt.plot(x,p(x),"r--")
    plt.plot(x,y,"+", ms=10, mec="k")
    # Second attempt: same fit, then overwrite the first predicted value
    z = np.polyfit(x, y, 1)
    y_hat = np.poly1d(z)(x)
    y_hat[0] = 0 ###--- Here I tried to replace the first value with 0 but it doesn't seem right to me.
    plt.plot(x, y_hat, "r--", lw=1)
    text = f"$y={z[0]:0.3f}\\;x{z[1]:+0.3f}$\n$R^2 = {r2_score(y,y_hat):0.3f}$"
    plt.gca().text(0.05, 0.95, text,transform=plt.gca().transAxes, fontsize=14, verticalalignment='top')
Is there a way to do this? Any help would be greatly appreciated.
You can use SciPy's curve_fit. Define your trend line as y = ax, so that it passes through the origin by construction.
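import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

def func(x, a):
    # Linear model with no intercept, so the line passes through (0, 0)
    return a * x

xdata = np.array([0, 10, 20, 30, 40])
ydata = np.array([0, 12, 18, 35, 38])
popt, pcov = curve_fit(func, xdata, ydata)  # popt[0] is the fitted slope
plt.scatter(xdata, ydata)
plt.plot(xdata, func(xdata, *popt), "r--")
plt.show()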
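For a one-parameter model like y = ax, the least-squares slope also has a closed form, a = sum(x*y) / sum(x*x), so an optimizer is not strictly needed. Below is a minimal sketch with NumPy on the same example data; the variable names are my own, and the np.linalg.lstsq call is only there as a cross-check.

```python
import numpy as np

# Example data from the answer above.
x = np.array([0, 10, 20, 30, 40], dtype=float)
y = np.array([0, 12, 18, 35, 38], dtype=float)

# Closed-form least-squares slope for y = a*x (no intercept term).
slope = np.sum(x * y) / np.sum(x * x)

# Cross-check with a general least-squares solver. The design matrix has a
# single column (x) and no column of ones, so no intercept is fitted.
slope_lstsq = np.linalg.lstsq(x[:, np.newaxis], y, rcond=None)[0][0]

print(f"slope = {slope:.6f}")        # slope = 1.016667
print(f"lstsq = {slope_lstsq:.6f}")
```

Both values should agree with popt[0] from curve_fit up to numerical tolerance.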
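You can use statsmodels to run a simple linear regression without an intercept (sm.OLS does not add a constant unless you add one yourself):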
import statsmodels.api as sm
xdata = [0,10,20,30,40]
ydata = [0,12,18,35,38]
res = sm.OLS(ydata, xdata).fit()
The slope and R2 are then available as attributes:
res.params
#array([1.01666667])
res.rsquared
#0.9884709382637339
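One thing to be aware of: for a model without a constant, statsmodels reports the uncentered R2, which compares the residuals against sum(y**2) rather than against sum((y - mean(y))**2). sklearn's r2_score, as used in the question, follows the centered definition, so the two numbers will generally differ for the same fit. A small sketch of both definitions, using only NumPy and my own variable names:

```python
import numpy as np

x = np.array([0, 10, 20, 30, 40], dtype=float)
y = np.array([0, 12, 18, 35, 38], dtype=float)

slope = np.sum(x * y) / np.sum(x * x)      # through-origin least-squares slope
ss_res = np.sum((y - slope * x) ** 2)      # residual sum of squares

# Uncentered: total sum of squares is taken around zero.
r2_uncentered = 1 - ss_res / np.sum(y ** 2)
# Centered: total sum of squares is taken around the mean of y.
r2_centered = 1 - ss_res / np.sum((y - y.mean()) ** 2)

print(f"uncentered R2 = {r2_uncentered:.4f}")
print(f"centered R2   = {r2_centered:.4f}")
```

On this data the uncentered value matches res.rsquared (about 0.988), while the centered value comes out lower (about 0.964). Which one to show on the plot is a judgment call; just label it accordingly.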
There is plenty of other information as well:
res.summary()
                                 OLS Regression Results
=======================================================================================
Dep. Variable:                      y   R-squared (uncentered):                   0.988
Model:                            OLS   Adj. R-squared (uncentered):              0.986
Method:                 Least Squares   F-statistic:                              342.9
Date:                Tue, 29 Sep 2020   Prob (F-statistic):                    5.00e-05
Time:                        15:39:50   Log-Likelihood:                         -12.041
No. Observations:                   5   AIC:                                      26.08
Df Residuals:                       4   BIC:                                      25.69
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x1             1.0167      0.055     18.519      0.000       0.864       1.169
==============================================================================
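To tie this back to the question, here is a rough sketch of the plotting loop with the trend line forced through the origin. It makes a few assumptions that are not in the original post: "index" is taken to be the DataFrame index, the loop covers every column (the question used df.columns[1:]), the helper fit_through_origin is a name I made up, and the R2 shown is the uncentered one that statsmodels reports for a no-intercept model.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Data from the question; "index" is assumed to be the DataFrame index.
df = pd.DataFrame(
    {
        "var1": [0.0, 43940.7, 429215.0],
        "var2": [0.0, 2218.3, 16844.3],
        "var3": [0.0, 6581.7, 51682.7],
    },
    index=[0, 10, 100],
)


def fit_through_origin(x, y):
    """Return (slope, uncentered R2) for the least-squares fit y = slope * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope = np.sum(x * y) / np.sum(x * x)
    ss_res = np.sum((y - slope * x) ** 2)
    return slope, 1 - ss_res / np.sum(y ** 2)


results = {}
for var in df.columns:
    x = df.index.to_numpy(dtype=float)
    y = df[var].to_numpy(dtype=float)
    slope, r2 = fit_through_origin(x, y)
    results[var] = (slope, r2)

    fig, ax = plt.subplots(figsize=(10, 7))
    ax.plot(x, y, "+", ms=10, mec="k")
    ax.plot(x, slope * x, "r--", lw=1)  # passes through (0, 0) by construction
    text = f"$y={slope:0.3f}\\;x$\n$R^2 = {r2:0.3f}$"
    ax.text(0.05, 0.95, text, transform=ax.transAxes,
            fontsize=14, verticalalignment="top")
```

Call plt.show() (or fig.savefig(...) inside the loop) afterwards to see the figures. Because the line is drawn as slope * x, there is no need to overwrite the first predicted value by hand.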