Difference in result between fmin and fminsearch in Matlab and Python

My objective is to perform an inverse Laplace transform (ILT) on some decay data (NMR T2 decays acquired through CPMG). For this we were given the CONTIN algorithm, which was adapted to Matlab by Iari-Gabriel Marino and works very well. I want to adapt this code into Python. The core of the problem is with scipy.optimize.fmin, which does not minimize the mean square deviation (MSD) in any way comparable to Matlab's fminsearch. The latter gives a good minimization; the former does not.
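For context, the forward model in both MWEs below is a discretized Laplace transform (this is exactly what the line `A = exp(-tM/sM)` builds): the measured decay is a weighted sum of exponentials, and the ILT recovers the weights,

$$y(t_j) = \sum_i g(s_i)\, e^{-t_j/s_i} + \text{noise}, \qquad \text{i.e.} \quad y = A\,g \ \ \text{with} \ \ A_{ji} = e^{-t_j/s_i}.$$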

I have gone through my adapted code and the original Matlab line by line. I checked every matrix and every output, and used this to pin down the critical point: fmin. I also tried scipy.optimize.minimize and other minimization algorithms, but none gave satisfactory results.

I have made two MWEs, one for Python and one for Matlab, to make this reproducible for everyone. The example data were taken from the documentation of the Matlab function. Apologies if this is long code, but I really don't know how to shorten it without sacrificing readability and clarity; I tried to make the lines match as closely as possible. I am using Python 3.7.3, scipy 1.3.0, numpy 1.16.2, and Matlab R2018b on Windows 8.1, on a relatively fresh Anaconda install (<2 months old).

My code:

import numpy as np
from scipy.optimize import fmin
import matplotlib.pyplot as plt

def msd(g, y, A, alpha, R, w, constraints):
    """ msd: mean square deviation. This is the function to be minimized by fmin"""
    if 'zero_at_extremes' in constraints:
        g[0] = 0
        g[-1] = 0
    if 'g>0' in constraints:
        g = np.abs(g)

    r = np.diff(g, axis=0, n=2)
    yfit = A @ g
    # Sum of weighted square residuals
    VAR = np.sum(w * (y - yfit) ** 2)
    # Regularizer
    REG = alpha ** 2 * np.sum((r - R @ g) ** 2)
    # output to be minimized
    return VAR + REG

# Objective: match this distribution
g0 = np.array([0, 0, 10.1625, 25.1974, 21.8711, 1.6377, 7.3895, 8.736, 1.4256, 0, 0]).reshape((-1, 1))
s0 = np.logspace(-3, 6, len(g0)).reshape((-1, 1))
t = np.linspace(0.01, 500, 100).reshape((-1, 1))
sM, tM = np.meshgrid(s0, t)
A = np.exp(-tM / sM)
np.random.seed(1)
# Creates data from the initial distribution with some random noise.
data = (A @ g0) + 0.07 * np.random.rand(t.size).reshape((-1, 1))

# Parameters and function start
alpha = 1E-2  # regularization parameter
s = np.logspace(-3, 6, 20).reshape((-1, 1)) # x of the ILT
g0 = np.ones(s.size).reshape((-1, 1))        # guess of y of ILT
y = data                                    # noisy data
options = {'maxiter':1e8, 'maxfun':1e8}     # for the fmin function
constraints=['g>0', 'zero_at_extremes']     # constraints for the MSD function
R = np.zeros((len(g0) - 2, len(g0)), order='F')  # Regularizer
w=np.ones(y.reshape(-1, 1).size).reshape((-1, 1)) # Weights
sM, tM = np.meshgrid(s, t, indexing='xy')
A = np.exp(-tM/sM)
g0 = g0 * y.sum() / (A @ g0).sum()  # Makes a "better guess" for the distribution, according to algorithm

print('msd of input data:\n', msd(g0, y, A, alpha, R, w, constraints))

for i in range(5):  # Just for testing. If this is extremely high, ~1000, it's still bad.
    g = fmin(func=msd,
             x0=g0,
             args=(y, A, alpha, R, w, constraints),
             **options,
             disp=True)[:, np.newaxis]
    msdfit = msd(g, y, A, alpha, R, w, constraints)
    if 'zero_at_extremes' in constraints:
        g[0] = 0
        g[-1] = 0
    if 'g>0' in constraints:
        g = np.abs(g)

    g0 = g

print('New guess', g)
print('Final msd of g', msdfit)

# Visualize the fit
plt.plot(s, g, label='Initial approximation')
plt.plot(np.logspace(-3, 6, 11), np.array([0, 0, 10.1625, 25.1974, 21.8711, 1.6377, 7.3895, 8.736, 1.4256, 0, 0]), label='Distribution to match')
plt.xscale('log')
plt.legend()
plt.show()

Matlab:

% Objective: match this distribution
g0 = [0 0 10.1625 25.1974 21.8711 1.6377 7.3895 8.736 1.4256 0 0]';
s0  = logspace(-3,6,length(g0))';
t = linspace(0.01,500,100)';
[sM,tM] = meshgrid(s0,t);
A = exp(-tM./sM);
rng(1);
% Creates data from the initial distribution with some random noise.
data = A*g0 + 0.07*rand(size(t));

% Parameters and function start
alpha = 1e-2; % regularization parameter
s = logspace(-3,6,20)'; % x of the ILT
g0 = ones(size(s)); % initial guess of y of ILT
y = data; % noisy data
options = optimset('MaxFunEvals',1e8,'MaxIter',1e8); % options for fminsearch
constraints = {'g>0','zero_at_the_extremes'}; % constraints for MSD
R = zeros(length(g0)-2,length(g0));
w = ones(size(y(:)));
[sM,tM] = meshgrid(s,t);
A = exp(-tM./sM);
g0 = g0*sum(y)/sum(A*g0); % Makes a "better guess" for the distribution

disp('msd of input data:')
disp(msd(g0, y, A, alpha, R, w, constraints))

for k = 1:5
    [g,msdfit] = fminsearch(@msd,g0,options,y,A,alpha,R,w,constraints);
    if ismember('zero_at_the_extremes',constraints)
        g(1) = 0;
        g(end) = 0;
    end
    if ismember('g>0',constraints)
        g = abs(g);
    end
    g0 = g;
end

disp('New guess')
disp(g)
disp('Final msd of g')
disp(msdfit)

% Visualize the fit
semilogx(s, g)
hold on
semilogx(logspace(-3,6,11), [0 0 10.1625 25.1974 21.8711 1.6377 7.3895 8.736 1.4256 0 0])
legend('First approximation', 'Distribution to match')
hold off
function out = msd(g,y,A,alpha,R,w,constraints)
% msd: The mean square deviation; this is the function
% that has to be minimized by fminsearch

% Constraints and any 'a priori' knowledge
if ismember('zero_at_the_extremes',constraints)
    g(1) = 0;
    g(end) = 0;
end
if ismember('g>0',constraints)
    g = abs(g); % must be g(i)>=0 for each i
end
r = diff(diff(g(1:end))); % second derivative of g
yfit = A*g;
% Sum of weighted square residuals
VAR = sum(w.*(y-yfit).^2);
% Regularizer
REG = alpha^2 * sum((r-R*g).^2);
% Output to be minimized
out = VAR+REG;
end

Here is the optimization in Python: (figure omitted)

And here is the optimization in Matlab: (figure omitted)

I checked the MSD of g0 before starting, and both give the value 2651. After minimization, Python goes up to 4547, whereas Matlab goes down to 0.1381.

I think the problem is one of the following: either it is in my implementation, i.e. I am using fmin wrong, or there is some other passage I got wrong, but I cannot figure out what. The fact that the MSD increases when it should decrease under a minimization function is damning. Reading the documentation, the scipy implementation differs from Matlab's: per their documentation, Matlab uses the Nelder-Mead method as described in Lagarias, while scipy uses the original Nelder-Mead. Maybe that has a large influence? Or maybe my initial guess is simply too poor for scipy's algorithm?
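On that last point, one thing worth noting: scipy.optimize.minimize exposes an adaptive Nelder-Mead variant (simplex parameters scaled with the problem dimension, following Gao and Han) via options={'adaptive': True}. Below is a minimal sketch of how I called it, reusing msd and the arguments from the MWE; as said above, the scipy.optimize.minimize attempts did not give satisfactory results either:

from scipy.optimize import minimize

# Sketch: the same objective through scipy.optimize.minimize with the
# adaptive Nelder-Mead variant. This did not help either.
res = minimize(msd, g0.ravel(),
               args=(y, A, alpha, R, w, constraints),
               method='Nelder-Mead',
               options={'maxiter': int(1e8), 'maxfev': int(1e8),
                        'adaptive': True})
g = res.x[:, np.newaxis]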

So, it has been a long time since I posted this, but I want to share what I ended up learning and doing.

The inverse Laplace transform of CPMG data is a bit of a misnomer; it is more properly called an inversion. The general problem is solving a Fredholm integral of the first kind. One way of doing this is the Tikhonov regularization method. It turns out you can describe this problem quite easily with numpy and solve it with scipy packages, so I did not have to "reinvent the wheel".
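Concretely, Tikhonov regularization replaces the ill-posed inversion of $y \approx Kg$ with a penalized least-squares problem, and the penalty can be absorbed into a single stacked system (a standard identity, and exactly what the function below builds):

$$\min_{g \ge 0}\; \lVert Kg - y \rVert^2 + \alpha^2 \lVert g \rVert^2 \;=\; \min_{g \ge 0}\; \left\lVert \begin{bmatrix} K \\ \alpha I \end{bmatrix} g - \begin{bmatrix} y \\ 0 \end{bmatrix} \right\rVert^2.$$

With the constraint $g \ge 0$, the stacked system is just one call to a non-negative least-squares solver.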

I used the solution shown in this post, and the names here reflect that solution.

import numpy as np
from scipy.optimize import nnls


def tikhonov_regularized_inversion(
    kernel: np.ndarray, alpha: float, data: np.ndarray
) -> np.ndarray:
    data = data.reshape(-1, 1)
    # Stack the kernel on top of the scaled identity, and the data on top
    # of a matching block of zeros, then solve with non-negative least squares.
    I = alpha * np.eye(*kernel.shape)
    C = np.concatenate([kernel, I], axis=0)
    d = np.concatenate([data, np.zeros_like(data)])
    x, _ = nnls(C, d.flatten())
    return x

Here, kernel is a matrix containing all of the possible exponential decay curves, and the solution determines the contribution of each decay curve to the measured data. First, I stack my data as a column and pad it with zeros, creating the vector d. Then, I stack my kernel on top of a diagonal matrix containing the regularization parameter alpha along its diagonal, of the same size as the kernel. Last, I call nnls, the convenient non-negative least-squares solver in scipy.optimize. This is because there is no reason for a negative contribution, only for no contribution.
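For completeness, here is a minimal usage sketch on the MWE data from above (A, alpha, s, and data are the variables defined in the Python MWE; the alpha is simply the one used there, not a tuned choice):

# Invert the noisy MWE decay with the function above and plot the result.
g_tikh = tikhonov_regularized_inversion(A, alpha, data)

plt.plot(s.ravel(), g_tikh, label='Tikhonov + NNLS')
plt.xscale('log')
plt.legend()
plt.show()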

This solved my problem, and it is fast and convenient.