Is there a way to get the error in fitting parameters from scipy.stats.norm.fit?
I have some data that I fitted to a normal distribution using the fit function of the scipy.stats.norm object, like so:
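import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

x = np.random.normal(size=50000)
fig, ax = plt.subplots()
nbins = 75
mu, sigma = norm.fit(x)  # maximum-likelihood estimates of mean and standard deviation
# density=True replaces the normed=1 argument, which newer matplotlib releases no longer accept
n, bins, patches = ax.hist(x, nbins, density=True, facecolor='grey', alpha=0.5, label='before')
# norm.pdf replaces mlab.normpdf, which has been removed from matplotlib
y0 = norm.pdf(bins, mu, sigma)  # line of best fit
ax.plot(bins, y0, 'k--', linewidth=2, label='fit before')
# raw string so that \m and \s are not read as escape sequences
ax.set_title(r'$\mu$={}, $\sigma$={}'.format(mu, sigma))
plt.show()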
I would now like to extract the uncertainty/error in the fitted mu and sigma values. How can I do that?
You can use scipy.optimize.curve_fit. This method returns not only the estimated optimal values of the parameters, but also the corresponding covariance matrix:
popt : array
Optimal values for the parameters so that the sum of the squared residuals of f(xdata, *popt) - ydata is minimized
pcov : 2d array
The estimated covariance of popt. The diagonals provide the variance of the parameter estimate. To compute one standard deviation errors on the parameters use perr = np.sqrt(np.diag(pcov)).
How the sigma parameter affects the estimated covariance depends on absolute_sigma argument, as described above.
If the Jacobian matrix at the solution doesn’t have a full rank, then ‘lm’ method returns a matrix filled with np.inf, on the other hand ‘trf’ and ‘dogbox’ methods use Moore-Penrose pseudoinverse to compute the covariance matrix.
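To make the quoted description concrete, here is a minimal sketch of how `perr` is obtained from `pcov` and how `absolute_sigma` changes it. The synthetic data, the noise level `y_err`, the seed, and the helper name `gauss_pdf` are assumptions made for illustration; they are not part of the original question.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm


def gauss_pdf(x, mu, sig):
    """Normal PDF used as the model function (illustrative helper)."""
    return norm.pdf(x, loc=mu, scale=sig)


# Synthetic data (assumption): a normal PDF plus small Gaussian noise.
rng = np.random.default_rng(0)
xdata = np.linspace(-4.0, 4.0, 60)
y_err = 0.01
ydata = gauss_pdf(xdata, 0.5, 1.2) + rng.normal(0.0, y_err, xdata.size)
sigma_y = np.full(xdata.size, y_err)

# absolute_sigma=False (the default): sigma_y only sets relative weights and
# pcov is rescaled by the reduced chi-square of the fit.
popt, pcov_rel = curve_fit(gauss_pdf, xdata, ydata, p0=[0, 1],
                           sigma=sigma_y, absolute_sigma=False)
# absolute_sigma=True: sigma_y is taken as the actual error of each point.
_, pcov_abs = curve_fit(gauss_pdf, xdata, ydata, p0=[0, 1],
                        sigma=sigma_y, absolute_sigma=True)

# One-standard-deviation errors on (mu, sig), as in the quoted documentation.
perr_rel = np.sqrt(np.diag(pcov_rel))
perr_abs = np.sqrt(np.diag(pcov_abs))

# The two covariance matrices differ by the reduced chi-square factor.
chi2 = np.sum(((ydata - gauss_pdf(xdata, *popt)) / sigma_y) ** 2)
dof = xdata.size - len(popt)

print("popt:", popt)
print("perr (relative sigma):", perr_rel)
print("perr (absolute sigma):", perr_abs)
print("reduced chi-square:", chi2 / dof)
```

If the per-point errors are only known up to a common scale, the default `absolute_sigma=False` is usually the safer choice, since the rescaling absorbs that unknown factor.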
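You can compute the one-standard-deviation errors of the parameters from the square roots of the diagonal elements of the covariance matrix, as follows: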
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
from scipy.optimize import curve_fit
x = np.random.normal(size=50000)
fig, ax = plt.subplots()
nbins = 75
n, bins, patches = ax.hist(x, nbins, density=True, facecolor='grey', alpha=0.5, label='before')
# fit the normal PDF to the bin heights, evaluated at the bin centers
centers = 0.5 * (bins[1:] + bins[:-1])
pars, cov = curve_fit(lambda x, mu, sig: norm.pdf(x, loc=mu, scale=sig), centers, n, p0=[0, 1])
ax.plot(centers, norm.pdf(centers, *pars), 'k--', linewidth=2, label='fit before')
# raw string so that \m, \p and \s are not read as escape sequences
ax.set_title(r'$\mu={:.4f}\pm{:.4f}$, $\sigma={:.4f}\pm{:.4f}$'.format(pars[0], np.sqrt(cov[0, 0]), pars[1], np.sqrt(cov[1, 1])))
plt.show()
The result is shown in the figure below:
See also lmfit (https://github.com/lmfit/lmfit-py), which gives an easier interface and reports uncertainties in fitted variables. To fit data to a normal distribution, see http://lmfit.github.io/lmfit-py/builtin_models.html#example-1-fit-peak-data-to-gaussian-lorentzian-and-voigt-profiles and use something like
from lmfit.models import GaussianModel
# xdata and ydata are your own arrays of x values and measured y values
model = GaussianModel()
# create parameters with initial guesses:
params = model.make_params(center=9, amplitude=40, sigma=1)
result = model.fit(ydata, params, x=xdata)
print(result.fit_report())
The report will include the 1-sigma uncertainties, for example
[[Variables]]
sigma: 1.23218358 +/- 0.007374 (0.60%) (init= 1.0)
center: 9.24277047 +/- 0.007374 (0.08%) (init= 9.0)
amplitude: 30.3135620 +/- 0.157126 (0.52%) (init= 40.0)
fwhm: 2.90157055 +/- 0.017366 (0.60%) == '2.3548200*sigma'
height: 9.81457817 +/- 0.050872 (0.52%) == '0.3989423*amplitude/max(1.e-15, sigma)'
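Both answers above fit a curve to binned data, so the uncertainties they report describe that least-squares fit and may differ from the uncertainty of the `norm.fit` estimates in the question. As a hedged cross-check, the sketch below applies the standard large-sample results for the maximum-likelihood estimates of a normal sample, SE(mu) ≈ sigma/sqrt(n) and SE(sigma) ≈ sigma/sqrt(2n), and compares them with a simple bootstrap. The seed, the number of bootstrap replicates `n_boot`, and the variable names are assumptions chosen for illustration.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)  # fixed seed (assumption) for reproducibility
x = rng.normal(size=50000)

# Maximum-likelihood estimates, as in the question.
mu, sigma = norm.fit(x)
n = x.size

# Large-sample (asymptotic) standard errors of the normal MLEs.
se_mu = sigma / np.sqrt(n)
se_sigma = sigma / np.sqrt(2 * n)

# Bootstrap cross-check: refit on resampled data and take the spread
# of the refitted parameters as the uncertainty estimate.
n_boot = 200
boot = np.array([norm.fit(rng.choice(x, size=n, replace=True))
                 for _ in range(n_boot)])
boot_se_mu, boot_se_sigma = boot.std(axis=0, ddof=1)

print("mu    = {:.4f} +/- {:.4f} (bootstrap {:.4f})".format(mu, se_mu, boot_se_mu))
print("sigma = {:.4f} +/- {:.4f} (bootstrap {:.4f})".format(sigma, se_sigma, boot_se_sigma))
```

The closed-form errors rely on the data actually being normal and on n being large. The bootstrap makes fewer assumptions and carries over to other scipy.stats distributions, at the cost of refitting many times.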