lmfit 中的缩放误差和信息标准

Question

我正在使用 lmfit 解决非线性优化问题。它工作正常，我试图将测量误差作为因变量 y (sigma_y) 的标准差来实现。我的问题是，我无法正确解释信息标准。在实施 return (model - y)/(sigma_y) 时，它们只是从非常低的值提高到非常高的值。

即[左：return (model - y)-加权->右：return (model - y)/(sigma_y)]：

卡方 0.00159805 -> 47.3184972
减少卡方 1.7756e-04 -> 5.25761080 expectation value is 1 ||
赤池信息暴击 -93.2055413 -> 20.0490661 the more negative, the better
贝叶斯信息暴击 -92.4097507 -> 20.8448566 the more negative, the better

我的猜测是，这在某种程度上与 lmfit 的错误使用（信息标准的错误计算、错误的错误缩放）或普遍缺乏对这些标准的理解有关（对我来说，减少了 5.258 的卡方（低于-estimated) 或 1.776e-4 (over-estimated) 听起来很不合适，但残差图等对我来说看起来还不错...)

这是我重现问题的示例代码：

import lmfit
import numpy as np
import matplotlib.pyplot as plt

y = np.array([1.42774324, 1.45919099, 1.58891696, 1.78432768, 1.97403125,
       2.17091161, 2.02065766, 1.83449279, 1.64412613, 1.47265405,
       1.4507    ])
sigma_y = np.array([0.00312633, 0.00391546, 0.00873894, 0.01252675, 0.00639111,
       0.00452488, 0.0050566 , 0.01127185, 0.01081748, 0.00227587,
       0.00190191])
x = np.array([0.02372331, 0.07251188, 0.50685822, 1.30761963, 2.10535442,
       2.90597497, 2.30733552, 1.50906939, 0.7098041 , 0.09580686,
       0.04777082])
offset = np.array([1.18151707, 1.1815602 , 1.18202847, 1.18187962, 1.18047493,
       1.17903493, 1.17962602, 1.18141625, 1.18216907, 1.18222051,
       1.18238949])
parameter = lmfit.Parameters()
parameter.add('par_1', value = 1e-5)
parameter.add('par_2', value = 0.18)

def U_reg(parameter, x, offset):
    par_1 = parameter['par_1']
    par_2 = parameter['par_2']
    model = offset + 0.03043211217 * np.arcsinh(x / (2 * par_1)) + par_2 * x
    return (model - y)/(sigma_y)

mini = lmfit.Minimizer(U_reg, parameter, fcn_args=(x, offset), nan_policy='omit', scale_covar=False)
regr1 = mini.minimize(method='least_sq') #Levenberg-Marquardt
print("Levenberg-Marquardt:")
print(lmfit.fit_report(regr1, min_correl=0.9))

def U_plot(parameter, x, offset):
    par_1 = parameter['par_1']
    par_2 = parameter['par_2']
    model = offset + 0.03043211217 * np.arcsinh(x / (2 * par_1)) + par_2 * x
    return model - y

plt.title("Model vs Data")
plt.plot(x, y, 'b')
plt.plot(x, U_plot(regr1.params, x, offset) + y, 'r--', label='Levenberg-Marquardt')
plt.ylabel("y")
plt.xlabel("x")
plt.legend(loc='best')
plt.show()

我知道通过 lmfit.Model 实现加权可能更方便，因为有多个回归变量，我想保留 lmfit.Minimizer 方法。我的python版本是3.8.5，lmfit是1.0.2

Answer 1

嗯，为了使卡方的大小有意义（例如，它在 (N_data - N_varys 左右），不确定性的尺度必须正确-- 给出每个观察值的 1-sigma 标准差。

lmfit 不太可能检测您使用的 sigma 是否具有正确的比例。

lmfit 中的缩放误差和信息标准

Scaling error and Information cirteria in lmfit

python

optimization

weighted

python-3.x

lmfit