curve_fit applied to a discrete dataset issue

Question

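I am trying to use curve_fit to fit a few discrete experimental values to a custom model. The problem is that I get the warning "OptimizeWarning: Covariance of the parameters could not be estimated" and, of course, no reliable parameter values.

I read that this problem is caused by the discrete nature of my dataset and that I could solve it with the LMFIT package. According to some examples I found, I should define a linspace and then assign my experimental values to the corresponding x points. Unfortunately, because I have so few points, this procedure introduces too much error. So I wonder whether there is a way to solve this with curve_fit. I use it in the same code to fit an exponential model to other data (same number of elements) without any problem.

Thanks for any hint. In detail, with the code reduced to the essentials: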

import numpy as np
from numpy import array
from scipy.optimize import curve_fit

xa = array([0.5, 0.53, 0.56, 0.59, 0.62, 0.65, 0.68, 0.7, 0.72, 0.74, 0.76, 0.78, 0.8, 0.82], dtype=object)

ya = array([0.40168, 0.40103999999999995, 0.40027999999999997, 0.39936, 0.39828, 0.397, 0.39544, 0.39424000000000003, 0.39292, 0.39144, 0.38976, 0.38788, 0.38580000000000003, 0.38348], dtype=object)

def fit_model(x, a, b):
    return (1 + np.exp((a - 0.57)/b))/(1 + np.exp((a-x)/b))

popt_an, pcov_an = curve_fit(fit_model, xa, ya, maxfev=100000)
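For reference, the SciPy documentation states that when the covariance cannot be estimated, curve_fit returns a pcov matrix filled with inf. A minimal sketch of a check for that case follows; the helper name covariance_is_usable is hypothetical and not part of SciPy.

```python
import numpy as np

def covariance_is_usable(pcov):
    """Return True only if every entry of the covariance matrix is finite.

    Hypothetical helper: curve_fit fills pcov with inf when the
    covariance of the parameters could not be estimated.
    """
    return bool(np.all(np.isfinite(pcov)))

# Synthetic matrices standing in for a failed and a successful fit.
failed_pcov = np.full((2, 2), np.inf)
good_pcov = np.array([[1e-6, 0.0], [0.0, 4e-6]])

print(covariance_is_usable(failed_pcov))
print(covariance_is_usable(good_pcov))
```

With the code above, one would call covariance_is_usable(pcov_an) right after the fit before trusting popt_an.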

Answer 1

It seems that fit_model cannot adapt to the data.
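One way to see this, as a small sketch: the original model equals 1 at x = 0.57 for any a and b, because numerator and denominator coincide there, while the measured values around x = 0.57 are close to 0.40. The trial parameter pairs below are arbitrary values chosen only for illustration.

```python
import numpy as np

def fit_model(x, a, b):
    # Model from the question, unchanged.
    return (1 + np.exp((a - 0.57) / b)) / (1 + np.exp((a - x) / b))

# Arbitrary (a, b) pairs, chosen only for illustration.
trial_params = [(1.0, 1.0), (0.3, 0.05), (2.0, -0.4)]

# Whatever the parameters, the model is pinned to 1 at x = 0.57.
values = [fit_model(0.57, a, b) for a, b in trial_params]
for (a, b), value in zip(trial_params, values):
    print(a, b, value)
```

No choice of a and b moves that pinned value, which may be why the optimizer cannot settle on reliable parameters.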

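I would make fit_model pass exactly through the first data point (0.5, 0.40168) and make the exponential term increase with x by changing (1 + np.exp((a - x)/b)) to (1 + np.exp((a + x)/b)), so that fit_model decreases with x, as the input data does.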

from numpy import array
import numpy as np

xa= array([0.5, 0.53, 0.56, 0.59, 0.62, 0.65, 0.68, 0.7, 0.72, 0.74,
0.76, 0.78, 0.8, 0.82], dtype=object)

ya= array([0.40168, 0.40103999999999995, 0.40027999999999997, 0.39936,
0.39828, 0.397, 0.39544, 0.39424000000000003, 0.39292, 0.39144, 0.38976, 0.38788, 0.38580000000000003, 0.38348], dtype=object)

from scipy.optimize import curve_fit

def fit_model(x, a, b):
    return (1 + np.exp((a + xa[0])/b))/(1 + np.exp((a + x)/b)) + (ya[0] - 1)

popt_an, pcov_an = curve_fit(fit_model, xa, ya, maxfev=100000)

The solution I get:

a = -1.47015573
b = 0.17030011

yp = array([0.40168   , 0.40103595, 0.40026891, 0.39935567, 0.39826869,
   0.39697541, 0.3954374 , 0.39425403, 0.39292656, 0.39143789,
   0.38976906, 0.38789897, 0.38580429, 0.38345918])
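As a quick sanity check, the reported parameters can be plugged back into the modified model and compared with the measurements. This sketch uses plain float arrays instead of object arrays (an assumption made for convenience) and the a and b values quoted above; the names a_fit, b_fit, and max_abs_residual are illustrative.

```python
import numpy as np

xa = np.array([0.5, 0.53, 0.56, 0.59, 0.62, 0.65, 0.68, 0.7, 0.72, 0.74,
               0.76, 0.78, 0.8, 0.82])
ya = np.array([0.40168, 0.40104, 0.40028, 0.39936, 0.39828, 0.397,
               0.39544, 0.39424, 0.39292, 0.39144, 0.38976, 0.38788,
               0.3858, 0.38348])

def fit_model(x, a, b):
    # Modified model from this answer: anchored at the first data point.
    return (1 + np.exp((a + xa[0]) / b)) / (1 + np.exp((a + x) / b)) + (ya[0] - 1)

# Parameter values as reported in this answer.
a_fit, b_fit = -1.47015573, 0.17030011

yp = fit_model(xa, a_fit, b_fit)
max_abs_residual = np.max(np.abs(ya - yp))

print(yp)
print(max_abs_residual)
```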

Answer 2

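I don't understand what the problem with using lmfit is here. I also don't understand the use of "object arrays" here. I would probably turn your hard-wired, non-x-dependent factor into its own variable (say, 'c') and use this: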

import numpy as np
import matplotlib.pyplot as plt
from lmfit import Model

xa = np.array([0.5, 0.53, 0.56, 0.59, 0.62, 0.65, 0.68, 0.7, 0.72, 0.74,
               0.76, 0.78, 0.8, 0.82])
ya = np.array([0.40168, 0.40103999999999995, 0.40027999999999997, 0.39936,
               0.39828, 0.397, 0.39544, 0.39424000000000003, 0.39292,
               0.39144, 0.38976, 0.38788, 0.38580000000000003, 0.38348])

def modelfunc(x, a, b, c):
    return (1 + c)/(1 + np.exp((a-x)/b))

my_model = Model(modelfunc)
params = my_model.make_params(a=1, b=-0.1, c=-0.5)
result = my_model.fit(ya, params, x=xa)

print(result.fit_report())

plt.plot(xa, ya, label='data')
plt.plot(xa, result.best_fit, label='fit')
plt.legend()
plt.show()

This will print out a report of

[[Model]]
    Model(modelfunc)
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 29
    # data points      = 14
    # variables        = 3
    chi-square         = 7.6982e-10
    reduced chi-square = 6.9984e-11
    Akaike info crit   = -324.734898
    Bayesian info crit = -322.817726
[[Variables]]
    a:  1.29660513 +/- 6.9684e-04 (0.05%) (init = 1)
    b: -0.16527738 +/- 2.7098e-04 (0.16%) (init = -0.1)
    c: -0.59507868 +/- 1.6502e-05 (0.00%) (init = -0.5)
[[Correlations]] (unreported correlations are < 0.100)
    C(a, b) = -0.995
    C(b, c) = -0.955
    C(a, c) =  0.925

and shows a plot of the data with the fitted curve overlaid.
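Since the question asks about curve_fit specifically, the same three-parameter model can presumably be fitted with SciPy directly. The following is a sketch, not a verified reproduction of the lmfit result: it assumes float arrays and reuses the starting values from make_params above as p0.

```python
import numpy as np
from scipy.optimize import curve_fit

xa = np.array([0.5, 0.53, 0.56, 0.59, 0.62, 0.65, 0.68, 0.7, 0.72, 0.74,
               0.76, 0.78, 0.8, 0.82])
ya = np.array([0.40168, 0.40104, 0.40028, 0.39936, 0.39828, 0.397,
               0.39544, 0.39424, 0.39292, 0.39144, 0.38976, 0.38788,
               0.3858, 0.38348])

def modelfunc(x, a, b, c):
    # Same model as in the lmfit example, with the free amplitude (1 + c).
    return (1 + c) / (1 + np.exp((a - x) / b))

# Starting values taken from the make_params call above (an assumption
# that they work equally well for curve_fit).
p0 = [1.0, -0.1, -0.5]
popt, pcov = curve_fit(modelfunc, xa, ya, p0=p0)

# One-sigma parameter uncertainties from the covariance diagonal.
perr = np.sqrt(np.diag(pcov))
max_abs_residual = np.max(np.abs(ya - modelfunc(xa, *popt)))

print(popt)
print(perr)
print(max_abs_residual)
```

Passing an explicit p0 matters here: without it curve_fit starts every parameter at 1, which is far from the values found above.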

python

numpy

curve-fitting