用 curve_fit 限制高斯拟合

Confine a gaussian fit with curve_fit

在我的学士论文框架内,我需要用python评估我的数据。不幸的是,我的同学们还没有合适的脚本,而且我对编程还很陌生。

我有这个数据集,我正在尝试使用 scipy.optimize.curve_fit 对其进行高斯拟合。由于有很多无法使用的计数,尤其是在轴的末端,我想限制要安装的部分。

图片raw data

这是我目前所拥有的:

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

x=np.arange(5120)
y=array([  0.81434599,   1.17054264,   0.85279188, ...,   1.        ,
     1.        ,  13.56291391])   #most of the data isn't interesting 
#to me, part of interest see below 

def Gauss(x, a, x0, sigma):
    return a * np.exp(-(x - x0)**2 / (2 * sigma**2))

mean = sum(x * y) / sum(y)
sigma = np.sqrt(sum(y * (x - mean)**2) / sum(y))

popt,pcov = curve_fit(Gauss, x, y, p0=[max(y), mean, sigma], 
maxfev=360000)

plt.plot(x,y,label='data')
plt.plot(x,Gauss(x, *popt), 'r-',label='fit')

在 docs.scipy.org 我找到了关于 curve_fit

的一般描述

如果我尝试使用 bounds=([2400,-np.inf, -np.inf],[2600, np.inf, np.inf]), 我收到 ValueError:x0 不可行。这里有什么问题?

我也试过用 popt,pcov = curve_fit(Gauss, x[2400:2600], y[2400:2600], p0=[max(y), mean, sigma], maxfev=360000) 正如对此问题的评论中所建议的: "Error when obtaining gaussian fit for graph" at Whosebug 在这种情况下,我只能得到一条直线。

图片:Confinement with x[2400:2600],y[2400:2600] as arguments of curve_fit

我真的希望你能帮我解决这个问题。我只需要一种方法来适应我的一小部分数据。提前致谢!

有趣的数据:

y=array([ 0.93396226,  1.00884956,  1.15457413,  1.07590759,  
0.88915094, 1.07142857,  1.10714286,  1.14171123,  1.06666667,  
0.84975369, 0.95480226,  0.99388379,  1.01675978,  0.83967391,  
0.9771987 , 1.02402402,  1.04531722,  1.07492795,  0.97135417,  
0.99714286, 1.0248139 ,  1.26223776,  1.1533101 ,  0.99099099,  
1.18867925, 1.15772871,  0.95076923,  1.03313253,  1.02278481,  
0.93265993, 1.06705539,  1.00265252,  1.02023121,  0.92076503,  
0.99728997, 1.03353659,  1.15116279,  1.04336043,  0.95076923,  
1.05515588, 0.92571429,  0.93448276,  1.02702703,  0.90056818,  
0.96068796, 1.08493151,  1.13584906,  1.1212938 ,  1.0739645 ,  
0.98972603, 0.94594595,  1.07913669,  0.98425197,  0.87762238,  
0.96811594, 1.02710843,  0.99392097,  0.91384615,  1.09809264,  
1.00630915, 0.93175074,  0.87572254,  1.00651466,  0.78772379,  
1.12244898, 1.2248062 ,  0.97109827,  0.94607843,  0.97900262,  
0.97527473, 1.01212121,  1.16422287,  1.20634921,  0.97275204,  
1.01090909, 0.99404762,  1.00561798,  1.01146132,  1.08695652,  
0.97214485, 1.03525641,  0.99096386,  1.05135952,  1.16451613,  
0.90462428, 0.76876877,  0.47701149,  0.27607362,  0.21580547,  
0.20598007, 0.16766467,  0.15533981,  0.19745223,  0.15407855,  
0.18925831, 0.26997245,  0.47603834,  0.596875  ,  0.85126582,  0.96      
, 1.06578947,  1.08761329,  0.89548023,  0.99705882,  1.07142857,
0.95677233,  0.86119874,  1.02857143,  0.98250729,  0.94214876,
1.04166667,  0.96024465,  1.07022472,  1.10344828,  1.04859335,
0.96655518,  1.06424581,  1.01754386,  1.03492063,  1.18627451,
0.91036415,  1.03355705,  1.09116809,  0.96083551,  1.01298701,
1.03691275,  1.02923977,  1.11612903,  1.01457726,  1.06285714,
0.98186528,  1.16470588,  0.86645963,  1.07317073,  1.09615385,
1.21192053,  0.94385027,  0.94244604,  0.88390501,  0.95718654,
0.9691358 ,  1.01729107,  1.01119403,  1.20350877,  1.12890625,
1.06940063,  0.90410959,  1.14662757,  0.97093023,  1.03021148,
1.10629921,  0.97118156,  1.10693642,  1.07917889,  0.9484127 ,
1.07581227,  0.98006645,  0.98986486,  0.90066225,  0.90066225,
0.86779661,  0.86779661,  0.96996997,  1.01438849,  0.91186441,
0.91290323,  1.03745318,  1.0615942 ,  0.97202797,  1.16608997,
0.94182825,  1.08333333,  0.9076087 ,  1.18181818,  1.20618557,
1.01273885,  0.93606138,  0.87457627,  0.90575916,  1.09756098,
0.99115044,  1.13380282,  1.04333333,  1.04026846,  1.0297619 ,
1.04334365,  1.03395062,  0.92553191,  0.98198198,  1.        ,
0.9439528 ,  1.02684564,  1.1372549 ,  0.96676737,  0.99649123,
1.07051282,  1.10367893,  1.0866426 ,  1.15384615,  0.99667774])

您可能会发现 lmfit 模块 (https://lmfit.github.io/lmfit-py/) 对此很有用。它旨在使曲线拟合变得非常容易,具有内置的常见峰模型(如高斯),并具有许多有用的功能,例如允许您设置参数范围。使用 lmfit 拟合您的数据可能如下所示:

import numpy as np
import matplotlib.pyplot as plt

from lmfit.models import GaussianModel, ConstantModel

y = np.array([.....])   # uses your shorter data range
x = np.arange(len(y))

# make a model that is a Gaussian + a constant:
model = GaussianModel(prefix='peak_') + ConstantModel()

# make parameters with starting values:
params = model.make_params(c=1.0, peak_center=90, 
                           peak_sigma=5, peak_amplitude=-5)

# it's not really needed for this data, but you can put bounds on 
# parameters like this (or set .vary=False to fix a parameter)
params['peak_sigma'].min = 0         # sigma  > 0
params['peak_amplitude'].max = 0     # amplitude < 0
params['peak_center'].min = 80
params['peak_center'].max = 100

# run fit
result = model.fit(y, params, x=x)

# print, plot results
print(result.fit_report())
plt.plot(x, y)
plt.plot(x, result.best_fit)
plt.show()

这将打印出来

[[Model]]
    (Model(gaussian, prefix='peak_') + Model(constant))
[[Fit Statistics]]
    # function evals   = 54
    # data points      = 200
    # variables        = 4
    chi-square         = 1.616
    reduced chi-square = 0.008
    Akaike info crit   = -955.625
    Bayesian info crit = -942.432
[[Variables]]
    peak_sigma:       4.03660814 +/- 0.204240 (5.06%) (init= 5)
    peak_center:      91.2246614 +/- 0.200267 (0.22%) (init= 90)
    peak_amplitude:  -9.79111362 +/- 0.445273 (4.55%) (init=-5)
    c:                1.02138228 +/- 0.006796 (0.67%) (init= 1)
    peak_fwhm:        9.50548558 +/- 0.480950 (5.06%)  == '2.3548200*peak_sigma'
    peak_height:     -0.96766623 +/- 0.041854 (4.33%)  == '0.3989423*peak_amplitude/max(1.e-15, peak_sigma)'
[[Correlations]] (unreported correlations are <  0.100)
    C(peak_sigma, peak_amplitude)  = -0.599 
    C(peak_amplitude, c)         = -0.328 
    C(peak_sigma, c)             =  0.196 

然后画这样的图: