How do I draw a linear regression line in Python, and which functions can I use to plot it?
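I've just started learning Python and would like to know how to draw a linear regression line for time series price data (for example, closing prices, where there is only a y variable).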
import pandas as pd
import pandas_datareader.data as web
import matplotlib.pyplot as plt
from datetime import datetime
start=datetime(2015,1,1)
end=datetime(2015,12,31)
df = web.DataReader("AMZN", "yahoo", start, end)
close = df['Close']
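I referred to this web page and understand the basic idea of drawing a linear regression curve, but I don't know which functions to use to write it in Python.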
Try this:
import pandas_datareader.data as web
from datetime import datetime
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
start = datetime(2015, 1, 1)
end = datetime(2015, 12, 31)
df = web.DataReader("AMZN", "yahoo", start, end)
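# Use the day of the year as the numeric regressor
df['day'] = df.index.map(lambda observation_timestamp: observation_timestamp.dayofyear)
y = df.Close
X = df.day
# add_constant prepends a 'const' column of ones so the model has an intercept
X = sm.add_constant(X)
est = sm.OLS(y, X)
est = est.fit()
# Evaluate the fitted line on 100 evenly spaced days for plotting
X_prime = np.linspace(X.day.min(), X.day.max(), 100)
X_prime = sm.add_constant(X_prime)
y_hat = est.predict(X_prime)
plt.plot(X_prime[:,1], y_hat)
plt.scatter(X.day, y)
plt.show()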
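For a single regressor, the line that sm.OLS fits above has a simple closed form, which can be handy for checking results by hand. Below is a minimal sketch in plain Python. The function name `fit_line` and the sample numbers are made up for illustration and are not AMZN data.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # slope = sum of cross-deviations / sum of squared x-deviations
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    # the least-squares line always passes through (x_mean, y_mean)
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Hypothetical data: prices lying exactly on y = 300 + 2 * day
days = [2, 5, 6, 7, 8]
prices = [300 + 2 * d for d in days]
intercept, slope = fit_line(days, prices)
print(intercept, slope)
```

On real price data the points will not sit exactly on a line, so the recovered intercept and slope are estimates, which is what the coefficient table from est.summary() reports.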
Running est.summary() gives:
                            OLS Regression Results
==============================================================================
Dep. Variable:                   Close   R-squared:                      0.935
Model:                             OLS   Adj. R-squared:                 0.934
Method:                  Least Squares   F-statistic:                    3570.
Date:                 Mon, 05 Dec 2016   Prob (F-statistic):         5.06e-150
Time:                         00:27:53   Log-Likelihood:               -1199.8
No. Observations:                  252   AIC:                            2404.
Df Residuals:                      250   BIC:                            2411.
Df Model:                            1
Covariance Type:             nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
const        289.9491      3.622     80.053      0.000       282.816   297.083
day            1.0212      0.017     59.748      0.000         0.988     1.055
==============================================================================
Omnibus:                        15.313   Durbin-Watson:                  0.117
Prob(Omnibus):                   0.000   Jarque-Bera (JB):               6.134
Skew:                            0.007   Prob(JB):                      0.0466
Kurtosis:                        2.236   Cond. No.                        429.
==============================================================================
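The const and day rows of the table define the fitted line, roughly Close = 289.9491 + 1.0212 * day. As a small illustration, the sketch below plugs a date into those reported coefficients. The helper name `predict_close` is hypothetical, and the result is only as precise as the rounded values in the summary.

```python
from datetime import date

# Coefficients copied from the summary table above (rounded as printed)
CONST = 289.9491
DAY_COEF = 1.0212


def predict_close(d):
    """Fitted closing price for a date, using day of year as the regressor."""
    day_of_year = d.timetuple().tm_yday  # same quantity as pandas' dayofyear
    return CONST + DAY_COEF * day_of_year


# 10 April 2015 is day 100 of the year
print(round(predict_close(date(2015, 4, 10)), 4))
```

Because the model was fit on 2015 data only, this is a description of that year's trend and not a forecast.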
Another example:
import pandas_datareader.data as web
from datetime import datetime
import statsmodels.api as sm
from patsy.highlevel import dmatrices
import matplotlib.pyplot as plt
start = datetime(2015, 1, 1)
end = datetime(2015, 12, 31)
df = web.DataReader("AMZN", "yahoo", start, end)
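# Use the day of the year as the numeric regressor
df['day'] = df.index.map(lambda observation_timestamp: observation_timestamp.dayofyear)
# The formula builds y and X; patsy adds the Intercept column automatically
y, X = dmatrices('Close ~ day', data=df, return_type='dataframe')
mod = sm.OLS(y, X)
res = mod.fit()
# Rainbow test for linearity; returns (F statistic, p-value)
sm.stats.linear_rainbow(res)
# Grid of regression diagnostic plots for the 'day' regressor
sm.graphics.plot_regress_exog(res, "day")
plt.show()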
You can also change sm.graphics.plot_regress_exog(res, "day") to sm.graphics.plot_fit(res, "day"), which plots the fitted values against day.
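The formula 'Close ~ day' asks patsy to build the design matrix, including an intercept column of ones, so no sm.add_constant call is needed here. That is also why the summary labels the row Intercept where the first example shows const. A rough plain-Python picture of what y and X contain is sketched below, using made-up rows instead of real AMZN data. The helper `design_matrices` is hypothetical and is not a patsy function.

```python
def design_matrices(rows, response, regressor):
    """Mimic the shape of dmatrices('response ~ regressor') for one regressor."""
    # y: one column holding the response values
    y = [[row[response]] for row in rows]
    # X: an intercept column of ones followed by the regressor column
    X = [[1.0, float(row[regressor])] for row in rows]
    return y, X


# Hypothetical rows standing in for the DataFrame
rows = [
    {"day": 2, "Close": 308.5},
    {"day": 5, "Close": 302.2},
    {"day": 6, "Close": 295.3},
]
y, X = design_matrices(rows, "Close", "day")
for x_row, y_row in zip(X, y):
    print(x_row, y_row)
```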
Running res.summary() gives:
                            OLS Regression Results
==============================================================================
Dep. Variable:                   Close   R-squared:                      0.935
Model:                             OLS   Adj. R-squared:                 0.934
Method:                  Least Squares   F-statistic:                    3570.
Date:                 Mon, 05 Dec 2016   Prob (F-statistic):         5.06e-150
Time:                         00:26:04   Log-Likelihood:               -1199.8
No. Observations:                  252   AIC:                            2404.
Df Residuals:                      250   BIC:                            2411.
Df Model:                            1
Covariance Type:             nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
Intercept    289.9491      3.622     80.053      0.000       282.816   297.083
day            1.0212      0.017     59.748      0.000         0.988     1.055
==============================================================================
Omnibus:                        15.313   Durbin-Watson:                  0.117
Prob(Omnibus):                   0.000   Jarque-Bera (JB):               6.134
Skew:                            0.007   Prob(JB):                      0.0466
Kurtosis:                        2.236   Cond. No.                        429.
==============================================================================