How can I calculate the 95% confidence interval for a curve fit from the pcov values?

Question

What are the values I am currently printing as sigma_ab? And how can I calculate the confidence interval at 95%?

for g in all:
    c0 = 5
    c2 = 0.2
    c3 = 0.7
    start = g['y'].iloc[0]
    
    p0 = np.array([c0, c2, c3]), # Construct initial guess array

    popt, pcov = curve_fit(
         model, g['x'], g['y'],
         absolute_sigma=True, maxfev=100000
    )
    
    sigma_ab = np.sqrt(np.diagonal(pcov))
    n = g.name
    print(n+' Estimated parameters: \n', popt)
    print(n + ' Approximated errors: \n', sigma_ab)

These are the estimated parameters:

[0.24803625 0.06072472 0.46449578]

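And this is sigma_ab, but I don't know what exactly it is. I would like to calculate the upper and lower bounds of the 95% confidence interval: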

[1.32778766 0.64261562 1.47915215]

Answer 1

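Your sigma_ab (the square roots of the diagonal elements of the covariance matrix) are the 1-sigma (68.3%) uncertainties. If the distribution of your uncertainties is strictly Gaussian (often a good but not perfect assumption, so perhaps "a decent starting estimate"), then the 2-sigma (95.45%) uncertainties are twice those values. Under the same assumption, a strict 95% interval is popt ± 1.96 × sigma_ab.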
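The arithmetic above can be sketched in a few lines. This is only an illustration under stated assumptions: the `model` function, the synthetic data, and the random seed are hypothetical stand-ins, since the question does not show the real model. The second interval uses a Student's t multiplier instead of the Gaussian one, a common refinement for small samples that the answer itself does not mention.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm, t

def model(x, a, b, c):
    # Hypothetical model, used only for illustration
    return a * np.exp(-b * x) + c

# Synthetic data (assumed; replace with your own x and y)
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = model(x, 2.5, 0.6, 0.4) + rng.normal(scale=0.05, size=x.size)

# With the default absolute_sigma=False, pcov is scaled by the
# reduced chi-square of the fit
popt, pcov = curve_fit(model, x, y, p0=[1.0, 1.0, 0.0])
sigma = np.sqrt(np.diag(pcov))  # 1-sigma uncertainties

# Two-sided 95% multiplier for a Gaussian distribution (about 1.96)
z = norm.ppf(0.975)
ci_normal = np.column_stack([popt - z * sigma, popt + z * sigma])

# Optional: Student's t multiplier with n - p degrees of freedom
dof = x.size - popt.size
tval = t.ppf(0.975, dof)
ci_t = np.column_stack([popt - tval * sigma, popt + tval * sigma])

for name, best, (lo, hi) in zip("abc", popt, ci_normal):
    print(f"{name} = {best:.4f}, 95% CI [{lo:.4f}, {hi:.4f}]")
```

Each row of `ci_normal` and `ci_t` holds the lower and upper bound for one parameter. These intervals are symmetric by construction, so they are only as good as the Gaussian approximation behind pcov.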

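If you want a more detailed measure of the uncertainties (and one that does not assume symmetric uncertainties), you might find lmfit and its Model class helpful. By default (and when possible) it reports 1-sigma uncertainties from the covariance, which is fast and usually pretty good. It can also explicitly find the 1-, 2-, and 3-sigma uncertainties, with the positive and negative sides determined separately. You did not give a very complete example, so it is hard to say what your model function is doing. If you had a model function like this: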

def modelfunc(x, amp, cen, sigma):
    return amp * np.exp(-(x-cen)*(x-cen)/sigma**2)

you could use

import numpy as np
import lmfit

def modelfunc(x, amp, cen, sigma):
    return amp * np.exp(-(x-cen)*(x-cen)/sigma**2)

x = np.linspace(-10.0, 10.0, 201)
y = modelfunc(x, 3.0, 0.5, 1.1) + np.random.normal(scale=0.1, size=len(x))

model = lmfit.Model(modelfunc)
params = model.make_params(amp=5., cen=0.2, sigma=1)

result = model.fit(y, params, x=x)
print(result.fit_report())

# now calculate explicit 1-, 2-, and 3-sigma uncertainties:
ci = result.conf_interval(sigmas=[1,2,3])
lmfit.printfuncs.report_ci(ci)

which will print out

[[Model]]
    Model(modelfunc)
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 21
    # data points      = 201
    # variables        = 3
    chi-square         = 1.93360112
    reduced chi-square = 0.00976566
    Akaike info crit   = -927.428077
    Bayesian info crit = -917.518162
[[Variables]]
    amp:    2.97351225 +/- 0.03245896 (1.09%) (init = 5)
    cen:    0.48792611 +/- 0.00988753 (2.03%) (init = 0.2)
    sigma:  1.10931408 +/- 0.01398308 (1.26%) (init = 1)
[[Correlations]] (unreported correlations are < 0.100)
    C(amp, sigma) = -0.577
          99.73%    95.45%    68.27%    _BEST_    68.27%    95.45%    99.73%
 amp  :  -0.09790  -0.06496  -0.03243   2.97351  +0.03255  +0.06543  +0.09901
 cen  :  -0.03007  -0.01991  -0.00992   0.48793  +0.00991  +0.01990  +0.03004
 sigma:  -0.04151  -0.02766  -0.01387   1.10931  +0.01404  +0.02834  +0.04309

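This gives explicitly calculated uncertainties and shows that, for this case, the very fast estimates of the 1-sigma uncertainties are very good, and that the 2-sigma values are very close to 2x the 1-sigma values. That said, you should not really trust any of these past the second significant digit.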

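Finally, in your example you are not actually passing in your initial values: p0 is built but never given to curve_fit. When p0 is omitted, curve_fit silently starts every parameter at 1, which illustrates a very serious flaw in curve_fit.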
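A minimal sketch of passing the initial guess explicitly, assuming the Gaussian `modelfunc` from the lmfit example stands in for the unknown model in the question. The data and seed are synthetic and hypothetical. Note also that the trailing comma after `np.array([c0, c2, c3])` in the question makes p0 a one-element tuple rather than an array, so it is dropped here.

```python
import numpy as np
from scipy.optimize import curve_fit

def modelfunc(x, amp, cen, sigma):
    # Same illustrative Gaussian as in the lmfit example
    return amp * np.exp(-(x - cen) * (x - cen) / sigma**2)

# Synthetic data (assumed; replace with g['x'] and g['y'])
rng = np.random.default_rng(1)
x = np.linspace(-10.0, 10.0, 201)
y = modelfunc(x, 3.0, 0.5, 1.1) + rng.normal(scale=0.1, size=x.size)

# Initial guess as a plain array: no trailing comma
p0 = np.array([5.0, 0.2, 1.0])

# Pass p0 explicitly; without it curve_fit starts all parameters at 1
popt, pcov = curve_fit(modelfunc, x, y, p0=p0)
sigma_ab = np.sqrt(np.diag(pcov))

print("Estimated parameters:", popt)
print("1-sigma uncertainties:", sigma_ab)
```

For a well-behaved model the default start may still converge, but for models with several local minima an unpassed guess can quietly change the result.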


python

curve-fitting

confidence-interval