Fitting a group of sigmoids with different models with least_squares()?
For a series of experiments measured at different time points, I would like to use an information criterion (such as the Akaike IC) to compare a fit with one fixed set of parameters across all experiments against a fit where each experiment has its own set of parameters. Can this be done with scipy.optimize.least_squares?
Assume the measurements roughly follow a sigmoid with parameters base (the level before the step), height (the level increase after the step), stretch (the duration of the step) and t50 (the time at which the level is base + height/2). I model these in the code below, which produces a numpy ndarray measurements of size [t, c], with t time points (experiments) and c values per time point:
import numpy as np
import matplotlib.pyplot as plt

# lifted, scaled, stretched and shifted sigmoid
def lssig(t, base, height, stretch, t50):
    return base + height / (1 + np.exp(-1 / stretch * (t - t50)))

# number of curves per experiment
numcurves = 10
# number of experiments (time pts)
num_experiments = 20

# timepoints (with simulated variability in timing)
time_jitter = 2
tstart = np.linspace(-10, 30, num=num_experiments)
tstart += time_jitter * np.random.rand(num_experiments)

# measurements + added noise (this is added noise, not the variability in the estimates)
measurements = np.empty(shape=(num_experiments, numcurves))
measure_noise = np.random.normal(0, .2, (num_experiments, numcurves))

# Variability of model parameters between curves:
# par_spreading[0] -> variability in base    (start value)
# par_spreading[1] -> variability in height  (end - start)
# par_spreading[2] -> variability in stretch (duration of step)
# par_spreading[3] -> variability in t50     (middle of step)
par_spreading = np.ones(4)

# parameters of the model:
# every curve can have its own parameters, variability between curves
# is given by the par_spreading value (see above) for that parameter
base = 2 * np.ones(numcurves) + par_spreading[0] * np.random.rand(numcurves)
height = 8 * np.ones(numcurves) + par_spreading[1] * np.random.rand(numcurves)
stretch = 4 * np.ones(numcurves) + par_spreading[2] * np.random.rand(numcurves)
t50 = 9 * np.ones(numcurves) + par_spreading[3] * np.random.rand(numcurves)

# fill the measurement array
for t in range(num_experiments):
    for c in range(numcurves):
        measurements[t, c] = lssig(tstart[t], base[c], height[c], stretch[c], t50[c])
measurements += measure_noise

# plot curves per subject
plt.plot(tstart, measurements)
plt.show()
There is a nice page with an example of using least_squares in which a single sigmoid is fitted, with the sigmoid's parameters passed in an array theta. Using that recipe I can fit the first of the curves, measurements[:,0], over all time points:
from scipy.optimize import least_squares

# point on the sigmoid, using array theta for parameters
def y(theta, t):
    return theta[0] + theta[1] / (1 + np.exp(-1 / theta[2] * (t - theta[3])))

# objective function for least_squares: difference between noisy data & model
def fun(theta):
    return y(theta, tstart) - measurements[:, 0]

# start with values for base, height, stretch and t50
theta0 = [2, 8, 4, 9]
res1 = least_squares(fun, theta0)

# show the parameters & plot the functions
print('real: {}, estimated: {}'.format([base[0], height[0], stretch[0], t50[0]], res1.x))

plt.plot(tstart, measurements[:, 0], tstart, y(res1.x, tstart))
plt.legend(('noisy', 'lsq'))
plt.show()
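Not part of the original question, but for reference: the same recipe can be repeated for every column of measurements to obtain a separate parameter set per curve. A minimal sketch:

# sketch: repeat the single-curve fit for every curve and collect the estimates
theta_per_curve = np.zeros((numcurves, 4))
for c in range(numcurves):
    res = least_squares(lambda th, c=c: y(th, tstart) - measurements[:, c], theta0)
    theta_per_curve[c] = res.x
print(theta_per_curve)   # one row of [base, height, stretch, t50] per curve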
The reason I use 4 separate parameters in the code above is that base can be either a single scalar (the same for all curves of an experiment) or an array (a separate base value for every curve). Can least_squares be used this way, or do all parameters have to be packed into one 1-D array?
It would be possible to flatten the parameter arrays (like theta), but that becomes extremely messy: in a model where every experiment has its own base parameter, the second element of the array would be a base value, whereas in a model with a single base parameter for all experiments, the second element would be the height value, and so on.
Ideally the parameters would stay separate, so that their lengths can vary independently.
Is there a manageable / elegant way of doing this?
Perhaps you could drastically shorten the question and improve readability by providing a minimal reproducible example! ;)
If I understand your question correctly:
To avoid dealing with "messy" parameter sets, I often create two small functions, e.g. param2minimizer and its inverse minimizer2param. These functions take care of organizing all the parameters for the minimizer, which keeps the code easier to follow.
For example, suppose you want to optimize an arbitrary number of [base, height] pairs:
def param2minimizer(bases, heights):
    # pack the separate parameter arrays into one 1-D vector
    return np.concatenate((bases, heights))

def minimizer2param(m):
    # unpack the 1-D vector back into the separate parameter arrays
    L = len(m)
    if L % 2:
        raise ValueError("Incompatible length")
    bases = m[0:L // 2]
    heights = m[L // 2:]
    return bases, heights
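As a rough sketch of how these helpers could be wired into least_squares (my own illustration, not part of the original answer; it reuses lssig, tstart, measurements and numcurves from the question, and for simplicity assumes a fixed, shared stretch and t50 while base and height vary per curve):

from scipy.optimize import least_squares

shared_stretch, shared_t50 = 4.0, 9.0   # assumed fixed here, purely for illustration

def residuals(m):
    # unpack the 1-D minimizer vector into the separate parameter arrays
    bases, heights = minimizer2param(m)
    # model value for every (time point, curve) pair via broadcasting
    model = lssig(tstart[:, None], bases[None, :], heights[None, :],
                  shared_stretch, shared_t50)
    # least_squares expects a 1-D residual vector
    return (model - measurements).ravel()

m0 = param2minimizer(2 * np.ones(numcurves), 8 * np.ones(numcurves))
res = least_squares(residuals, m0)
fitted_bases, fitted_heights = minimizer2param(res.x)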
You can adapt these two functions to do exactly what you need.
Regarding the dimensionality of the minimization variables, have a look at the documentation of least_squares. I think it is a better habit to systematically pass a vector (1-D) rather than a matrix (2-D), even if a particular minimizer accepts 2-D input. Some minimizers accept only 1-D, and optimization algorithms are usually described numerically in terms of a 1-D vector of variables. Using 1-D is safer and more in line with common usage.
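To tie this back to the model comparison in the question: once both the "shared parameters" fit and the "per-curve parameters" fit have been run through least_squares, an Akaike IC value can be computed from the residual sum of squares. A minimal sketch, assuming Gaussian noise and the standard formula AIC = n * ln(RSS / n) + 2k (up to an additive constant); the helper name aic_from_fit and the result variables in the comments are placeholders, not part of the original answer:

def aic_from_fit(res):
    # AIC (up to an additive constant) from a scipy least_squares result
    rss = np.sum(res.fun ** 2)   # residual sum of squares at the solution
    n = res.fun.size             # number of residuals (data points)
    k = res.x.size               # number of fitted parameters
    return n * np.log(rss / n) + 2 * k

# aic_shared = aic_from_fit(res_shared)         # model with one shared parameter set
# aic_per_curve = aic_from_fit(res_per_curve)   # model with per-curve parameters
# the model with the lower AIC is preferred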