How to find the area of a function (Pseudo Voigt) using optimized parameters from lmfit?
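I am trying to determine the area of a curve (peak). I was able to fit the peak (data) successfully using a pseudo-Voigt profile and an exponential background, and the fit parameters agree with those obtained from commercial software. The problem now is relating those fitted peak parameters to the area of the peak.
I could not find a simple way to calculate the peak area from the fitted parameters, unlike the case of a Gaussian line shape. So I am trying to use the scipy quad function to integrate my fitted function. I know the area should be around 19,000, as determined by the commercial software, but I am getting very large values that are not correct.
The fit works well (confirmed by plotting...), but the calculated area does not come close. After plotting my psuedo_voigt_func function with the best-fit values passed to it, I found that the resulting peak is far too intense. So the integration itself may be correct, and the error would then lie in how I build the peak by passing the fit parameters to psuedo_voigt_func, which I transcribed from the lmfit models website (https://lmfit.github.io/lmfit-py/builtin_models.html). I believe I scripted the pseudo-Voigt function correctly, but it does not work.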
#modules
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from lmfit.models import GaussianModel, LinearModel, VoigtModel, Pearson7Model, ExponentialModel, PseudoVoigtModel
from scipy.integrate import quad
#data
x = np.array([33.05, 33.1 , 33.15, 33.2 , 33.25, 33.3 , 33.35, 33.4 , 33.45, 33.5 , 33.55, 33.6 , 33.65, 33.7 , 33.75, 33.8 , 33.85, 33.9 , 33.95, 34. , 34.05, 34.1 , 34.15, 34.2 , 34.25, 34.3 , 34.35, 34.4 , 34.45, 34.5 , 34.55, 34.6 , 34.65, 34.7 , 34.75, 34.8 , 34.85, 34.9 , 34.95, 35. , 35.05, 35.1 , 35.15, 35.2 , 35.25, 35.3 , 35.35, 35.4 , 35.45, 35.5 , 35.55, 35.6 , 35.65, 35.7 , 35.75, 35.8 , 35.85, 35.9 , 35.95, 36. , 36.05, 36.1 , 36.15, 36.2 , 36.25, 36.3 , 36.35, 36.4 , 36.45])
y = np.array([4569, 4736, 4610, 4563, 4639, 4574, 4619, 4473, 4488, 4512, 4474, 4640, 4691, 4621, 4671, 4749, 4657, 4751, 4921, 5003, 5071, 5041, 5121, 5165, 5352, 5304, 5408, 5393, 5544, 5625, 5859, 5851, 6155, 6647, 7150, 7809, 9017, 10967, 14122, 19529, 28029, 39535, 50684, 55730, 52525, 41356, 30015, 20345, 14368, 10736, 9012, 7765, 7064, 6336, 6011, 5806, 5461, 5283, 5224, 5221, 4895, 4980, 4895, 4852, 4889, 4821, 4872, 4802, 4928])
#model
bkg_model = ExponentialModel(prefix='bkg_') #BACKGROUND model
peak_model = PseudoVoigtModel(prefix='peak_') #PEAK model
model = peak_model + bkg_model
#parameters
pars = bkg_model.guess(y, x=x) #BACKGROUND parameters
pars.update(peak_model.make_params()) #PEAK parameters
pars['peak_amplitude'].set(value=17791.293, min=0)
pars['peak_center'].set(value=35.2, min=0, max=91)
pars['peak_sigma'].set(value=0.05, min=0)
#fitting
init = model.eval(pars, x=x) #initial parameters
out = model.fit(y, pars, x=x) #fitting
#integration part
def psuedo_voigt_func(x, amp, cen, sig, alpha):
    sig_gauss = sig / np.sqrt(2*np.log(2))
    term1 = (amp * (1-alpha)) / (sig_gauss * np.sqrt(2*np.pi))
    term2 = np.exp(-(x-cen)**2) / (2 * sig_gauss**2)
    term3 = ((amp*alpha) / np.pi) * ( sig / ((x-cen)**2) + sig**2)
    psuedo_voigt = (term1 * term2) + term3
    return psuedo_voigt
fitted_amp = out.best_values['peak_amplitude']
fitted_cen = out.best_values['peak_center']
fitted_sig = out.best_values['peak_sigma']
fitted_alpha = out.best_values['peak_fraction']
print(quad(psuedo_voigt_func, min(x), max(x), args=(fitted_amp, fitted_cen, fitted_sig, fitted_alpha)))
#output result of fit:
[[Model]]
    (Model(pvoigt, prefix='peak_') + Model(exponential, prefix='bkg_'))
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 542
    # data points      = 69
    # variables        = 6
    chi-square         = 2334800.79
    reduced chi-square = 37060.3300
    Akaike info crit   = 731.623813
    Bayesian info crit = 745.028452
[[Variables]]
    bkg_amplitude:   4439.34760 +/- 819.477320 (18.46%) (init = 10.30444)
    bkg_decay:      -38229822.8 +/- 5.6275e+12 (14720258.94%) (init = -5.314193)
    peak_amplitude:  19868.0711 +/- 106.363477 (0.54%) (init = 17791.29)
    peak_center:     35.2039076 +/- 3.3971e-04 (0.00%) (init = 35.2)
    peak_sigma:      0.14358871 +/- 5.6049e-04 (0.39%) (init = 0.05)
    peak_fraction:   0.62733180 +/- 0.01233108 (1.97%) (init = 0.5)
    peak_fwhm:       0.28717742 +/- 0.00112155 (0.39%) == '2.0000000*peak_sigma'
    peak_height:     51851.3174 +/- 141.903066 (0.27%) == '(((1-peak_fraction)*peak_amplitude)/max(2.220446049250313e-16, (peak_sigma*sqrt(pi/log(2))))+(peak_fraction*peak_amplitude)/max(2.220446049250313e-16, (pi*peak_sigma)))'
[[Correlations]] (unreported correlations are < 0.100)
    C(bkg_amplitude, bkg_decay)      = -0.999
    C(peak_amplitude, peak_fraction) = 0.838
    C(peak_sigma, peak_fraction)     = -0.481
    C(bkg_decay, peak_amplitude)     = -0.338
    C(bkg_amplitude, peak_amplitude) = 0.310
    C(bkg_decay, peak_fraction)      = -0.215
    C(bkg_amplitude, peak_fraction)  = 0.191
    C(bkg_decay, peak_center)        = -0.183
    C(bkg_amplitude, peak_center)    = 0.183
    C(bkg_amplitude, peak_sigma)     = 0.139
    C(bkg_decay, peak_sigma)         = -0.137
#output of integration:
(4015474.293103768, 3509959.3601876567)
C:/Users/script.py:126: IntegrationWarning: The integral is probably divergent, or slowly convergent.
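As with the other peak-like line shapes and models in lmfit, the amplitude parameter should give the area of that component.
For what it's worth, it would probably be better to use the true Voigt function than the pseudo-Voigt function.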
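To make the "amplitude is the area" statement concrete, here is a minimal sketch that writes the pseudo-Voigt with the parentheses placed as in the lmfit documentation: the division by 2*sig_gauss**2 sits inside the exponent, and the Lorentzian denominator is (x - cen)**2 + sig**2. The function transcribed in the question places both differently, so its last term grows like 1/(x - cen)**2 near the centre, which would be consistent with the IntegrationWarning shown. The function name pseudo_voigt_lmfit is made up for this illustration, and the parameter values are simply copied from the fit report above.

```python
import numpy as np
from scipy.integrate import quad


def pseudo_voigt_lmfit(x, amp, cen, sig, alpha):
    """Pseudo-Voigt following the formula in the lmfit built-in models docs."""
    sig_gauss = sig / np.sqrt(2 * np.log(2))
    # Unit-area Gaussian: the whole fraction belongs inside the exponent.
    gauss = np.exp(-(x - cen) ** 2 / (2 * sig_gauss ** 2)) / (sig_gauss * np.sqrt(2 * np.pi))
    # Unit-area Lorentzian: sig**2 belongs inside the denominator.
    lorentz = sig / (np.pi * ((x - cen) ** 2 + sig ** 2))
    return amp * ((1 - alpha) * gauss + alpha * lorentz)


# Best-fit values copied from the fit report in the question.
amp, cen, sig, alpha = 19868.0711, 35.2039076, 0.14358871, 0.62733180

# Integrate over the whole real line, split at the centre so that quad
# starts each half right at the narrow peak.
left, _ = quad(pseudo_voigt_lmfit, -np.inf, cen, args=(amp, cen, sig, alpha))
right, _ = quad(pseudo_voigt_lmfit, cen, np.inf, args=(amp, cen, sig, alpha))
total_area = left + right

# The value at the centre should reproduce peak_height from the report.
peak_height = pseudo_voigt_lmfit(cen, amp, cen, sig, alpha)
print(total_area, peak_height)
```

Both components are normalized to unit area, so the integral over all x should come back as amp, and the value at the centre can be checked against the peak_height line of the report.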
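The OP's pseudo_voigt is not nicely formatted, but does not seem to be wrong; there are, however, different definitions of a pseudo_voigt. Below I implemented one from Wikipedia (link in the code), which usually gives good results. Looking at the log scale, though, it does not do well on this data. I also use the complex definition to get the true Voigt via the Faddeeva function, as is done in lmfit.
The code is as follows: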
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import wofz
from scipy.integrate import quad
def cauchy(x, x0, g):
    # unit-area Lorentzian, g is the half width at half maximum
    return 1. / ( np.pi * g * ( 1 + ( ( x - x0 ) / g )**2 ) )

def gauss( x, x0, s):
    # unit-area Gaussian, s is the standard deviation
    return 1./ np.sqrt(2 * np.pi * s**2 ) * np.exp( - (x-x0)**2 / ( 2 * s**2 ) )

# https://en.wikipedia.org/wiki/Voigt_profile#Numeric_approximations
def pseudo_voigt( x, x0, s, g, a, y0 ):
    fg = 2 * s * np.sqrt( 2 * np.log(2) )
    fl = 2 * g
    f = ( fg**5 + 2.69269 * fg**4 * fl + 2.42843 * fg**3 * fl**2 + 4.47163 * fg**2 * fl**3 + 0.07842 * fg * fl**4+ fl**5)**(1./5.)
    # note: Wikipedia's cubic term is ( fl / f )**3, not ( f / fl )**3 as written here
    eta = 1.36603 * ( fl / f ) - 0.47719 * ( fl / f )**2 + 0.11116 * ( f / fl )**3
    # note: f is a FWHM, while cauchy() and gauss() take a half width and a standard deviation
    return y0 + a * ( eta * cauchy( x, x0, f) + ( 1 - eta ) * gauss( x, x0, f ) )

def voigt( x, s, g):
    z = ( x + 1j * g ) / ( s * np.sqrt( 2. ) )
    v = wofz( z ) # Faddeeva function
    out = np.real( v ) / s / np.sqrt( 2 * np.pi )
    return out

def fitfun( x, x0, s, g, a, y0 ):
    # true Voigt peak of area a on a constant background y0
    return y0 + a * voigt( x - x0, s, g )
if __name__ == '__main__':
    xlist = np.array( [ 33.05, 33.1 , 33.15, 33.2 , 33.25, 33.3 , 33.35, 33.4 , 33.45, 33.5 , 33.55, 33.6 , 33.65, 33.7 , 33.75, 33.8 , 33.85, 33.9 , 33.95, 34. , 34.05, 34.1 , 34.15, 34.2 , 34.25, 34.3 , 34.35, 34.4 , 34.45, 34.5 , 34.55, 34.6 , 34.65, 34.7 , 34.75, 34.8 , 34.85, 34.9 , 34.95, 35. , 35.05, 35.1 , 35.15, 35.2 , 35.25, 35.3 , 35.35, 35.4 , 35.45, 35.5 , 35.55, 35.6 , 35.65, 35.7 , 35.75, 35.8 , 35.85, 35.9 , 35.95, 36. , 36.05, 36.1 , 36.15, 36.2 , 36.25, 36.3 , 36.35, 36.4 , 36.45])
    ylist = np.array( [ 4569, 4736, 4610, 4563, 4639, 4574, 4619, 4473, 4488, 4512, 4474, 4640, 4691, 4621, 4671, 4749, 4657, 4751, 4921, 5003, 5071, 5041, 5121, 5165, 5352, 5304, 5408, 5393, 5544, 5625, 5859, 5851, 6155, 6647, 7150, 7809, 9017, 10967, 14122, 19529, 28029, 39535, 50684, 55730, 52525, 41356, 30015, 20345, 14368, 10736, 9012, 7765, 7064, 6336, 6011, 5806, 5461, 5283, 5224, 5221, 4895, 4980, 4895, 4852, 4889, 4821, 4872, 4802, 4928])
    # fit both models; parameter order is x0, s, g, a, y0
    sol, err = curve_fit( pseudo_voigt, xlist, ylist, p0=[ 35.25,.05,.05, 30000., 3000] )
    solv, errv = curve_fit( fitfun, xlist, ylist, p0=[ 35.25,.05,.05, 20000., 3000] )
    print( solv )
    # evaluate both fits on a fine grid for plotting
    xth = np.linspace( xlist[0], xlist[-1], 500)
    yth = pseudo_voigt( xth, *sol )
    yv = fitfun( xth, *solv )
    # integrals over the data window, background included
    print( quad(pseudo_voigt, xlist[0], xlist[-1], args=tuple( sol ) ) )
    print( quad(fitfun, xlist[0], xlist[-1], args=tuple( solv ) ) )
    # same integral with the background y0 set to zero: the peak area alone
    solvNoBack = solv.copy()
    solvNoBack[-1] = 0
    print( quad(fitfun, xlist[0], xlist[-1], args=tuple( solvNoBack ) ) )
    fig = plt.figure()
    ax = fig.add_subplot( 1, 1, 1 )
    ax.plot( xlist, ylist, marker='o', linestyle='', label='data' )
    ax.plot( xth, yth, label='pseudo' )
    ax.plot( xth, yv, label='voigt with hack' )
    ax.set_yscale('log')
    plt.legend( loc=0 )
    plt.show()
This gives:
[3.52039054e+01 8.13244777e-02 7.80206967e-02 1.96178358e+04 4.48314849e+03]
(34264.98814344757, 0.00017531957481189617)
(34241.971907301166, 0.0002019796740206914)
(18999.266974139795, 0.0002019796990069267)
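The last number is the background-free peak area inside the data window, which is somewhat smaller than the fitted amplitude a (the fourth entry of the printed parameter vector). A short sketch of that relationship, assuming SciPy's scipy.special.voigt_profile (a unit-area Voigt, available in recent SciPy versions) as a stand-in for the wofz-based voigt() above; the helper name peak is made up for this illustration and the numbers are copied from the printed output.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import voigt_profile

# Voigt fit results printed above: x0, s, g, a (the background y0 is left out).
x0, s, g, a = 35.2039054, 0.0813244777, 0.0780206967, 19617.8358
x_min, x_max = 33.05, 36.45  # first and last x value of the data


def peak(x):
    """Background-free Voigt peak whose total area over all x is a."""
    return a * voigt_profile(x - x0, s, g)


# points=[x0] tells quad where the narrow peak sits inside the window.
window_area, _ = quad(peak, x_min, x_max, points=[x0])
fraction_inside = window_area / a
print(window_area, fraction_inside)
```

The window area should match the last integral printed above, and the fraction shows how much of the peak's total area the slowly decaying Lorentzian tails leave outside the measured range. If the full area is what is wanted, a itself is the number to report.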
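From the plot it is obvious that the pseudo_voigt is not very good. The integrals, however, do not differ much. Considering that the fit optimizes chi**2, this is not a big surprise, though.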
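One possible contributor to the poor pseudo_voigt curve is the way the Wikipedia approximation is wired up: there, the Lorentzian and Gaussian parts share the same FWHM f, and every term of the mixing parameter uses the ratio fl / f. The sketch below is an assumption-laden rewrite along those lines, not the code that produced the numbers above; the helper names lorentz_fwhm, gauss_fwhm and pseudo_voigt_wiki are invented here, and s and g are taken from the Voigt fit result.

```python
import numpy as np
from scipy.special import wofz


def lorentz_fwhm(x, f):
    """Unit-area Lorentzian parametrized by its FWHM f."""
    hw = f / 2.0  # half width at half maximum
    return hw / (np.pi * (x ** 2 + hw ** 2))


def gauss_fwhm(x, f):
    """Unit-area Gaussian parametrized by its FWHM f."""
    sd = f / (2.0 * np.sqrt(2.0 * np.log(2.0)))  # standard deviation
    return np.exp(-x ** 2 / (2.0 * sd ** 2)) / (sd * np.sqrt(2.0 * np.pi))


def pseudo_voigt_wiki(x, s, g):
    """Pseudo-Voigt approximation with the coefficients from Wikipedia."""
    fg = 2.0 * s * np.sqrt(2.0 * np.log(2.0))  # Gaussian FWHM
    fl = 2.0 * g                               # Lorentzian FWHM
    f = (fg ** 5 + 2.69269 * fg ** 4 * fl + 2.42843 * fg ** 3 * fl ** 2
         + 4.47163 * fg ** 2 * fl ** 3 + 0.07842 * fg * fl ** 4 + fl ** 5) ** 0.2
    r = fl / f
    eta = 1.36603 * r - 0.47719 * r ** 2 + 0.11116 * r ** 3
    return eta * lorentz_fwhm(x, f) + (1.0 - eta) * gauss_fwhm(x, f)


def voigt(x, s, g):
    """True unit-area Voigt profile via the Faddeeva function."""
    z = (x + 1j * g) / (s * np.sqrt(2.0))
    return np.real(wofz(z)) / (s * np.sqrt(2.0 * np.pi))


s, g = 0.0813244777, 0.0780206967  # widths from the Voigt fit above
x = np.linspace(-1.7, 1.7, 2001)   # offsets from the peak centre
exact = voigt(x, s, g)
approx = pseudo_voigt_wiki(x, s, g)
# Largest deviation, expressed as a fraction of the peak height.
max_dev = np.max(np.abs(approx - exact)) / exact.max()
print(max_dev)
```

With the widths handled this way, the approximation should stay within a couple of percent of the peak height of the true Voigt, which is the accuracy the Wikipedia section quotes for it.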