无法执行赋值，因为左侧的大小为 1×7×7，右侧的大小为 6×6

Question

我正在寻找一种方法来为 2 个给定矩阵找到相同的特征向量，这样我就可以进行联合对角化。为此，我发现并尝试使用以下函数中的 qndiag（来自 https://github.com/pierreablin/qndiag.git ）：

function [D, B, infos] = qndiag(C, varargin)
% Joint diagonalization of matrices using the quasi-Newton method
%
% The algorithm is detailed in:
%
%    P. Ablin, J.F. Cardoso and A. Gramfort. Beyond Pham’s algorithm
%    for joint diagonalization. Proc. ESANN 2019.
%    https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2019-119.pdf
%    https://hal.archives-ouvertes.fr/hal-01936887v1
%    https://arxiv.org/abs/1811.11433
%
% The function takes as input a set of matrices of size `(p, p)`, stored as
% a `(n, p, p)` array, `C`. It outputs a `(p, p)` array, `B`, such that the
% matrices `B * C(i,:,:) * B'` are as diagonal as possible.
%
% There are several optional parameters which can be provided in the
% varargin variable.
%
% Optional parameters:
% --------------------
% 'B0'                        Initial point for the algorithm.
%                             If absent, a whitener is used.
% 'weights'                   Weights for each matrix in the loss:
%                             L = sum(weights * KL(C, C')).
%                             No weighting (weights = 1) by default.
% 'maxiter'                   (int) Maximum number of iterations to perform.
%                             Default : 1000
%
% 'tol'                       (float) A positive scalar giving the tolerance at
%                             which the algorithm is considered to have converged.
%                             The algorithm stops when  |gradient| < tol.
%                             Default : 1e-6
%
% lambda_min                  (float) A positive regularization scalar. Each
%                             eigenvalue of the Hessian approximation below
%                             lambda_min is set to lambda_min.
%
% max_ls_tries                (int), Maximum number of line-search tries to
%                             perform.
%
% return_B_list               (bool) Chooses whether or not to return the list
%                              of iterates.
%
% verbose                     (bool) Prints informations about the state of the
%                             algorithm if True.
%
% Returns
% -------
% D : Set of matrices jointly diagonalized
% B : Estimated joint diagonalizer matrix.
% infos : structure containing monitoring informations, containing the times,
%     gradient norms and objective values.
%
% Example:
% --------
%
%  [D, B] = qndiag(C, 'maxiter', 100, 'tol', 1e-5)
%
% Authors: Pierre Ablin <pierre.ablin@inria.fr>
%          Alexandre Gramfort <alexandre.gramfort@inria.fr>
%
% License: MIT
% First tests
if nargin == 0,
    error('No signal provided');
end
if length(size(C)) ~= 3,
    error('Input C should be 3 dimensional');
end
if ~isa (C, 'double'),
  fprintf ('Converting input data to double...');
  X = double(X);
end
% Default parameters
C_mean = squeeze(mean(C, 1));
[p, d] = eigs(C_mean);
p = fliplr(p);
d = flip(diag(d));
B = p' ./ repmat(sqrt(d), 1, size(p, 1));
max_iter = 1000;
tol = 1e-6;
lambda_min = 1e-4;
max_ls_tries = 10;
return_B_list = false;
verbose = false;
weights = [];
% Read varargin
if mod(length(varargin), 2) == 1,
    error('There should be an even number of optional parameters');
end
for i = 1:2:length(varargin)
    param = lower(varargin{i});
    value = varargin{i + 1};
    switch param
        case 'B0'
            B0 = value;
        case 'max_iter'
            max_iter = value;
        case 'tol'
            tol = value;
        case 'weights'
            weights = value / mean(value(:));
        case 'lambda_min'
            lambda_min = value;
        case 'max_ls_tries'
            max_ls_tries = value;
        case 'return_B_list'
            return_B_list = value;
        case 'verbose'
            verbose = value;
        otherwise
            error(['Parameter ''' param ''' unknown'])
    end
end
[n_samples, n_features, ~] = size(C);
D = transform_set(B, C, false);
current_loss = NaN;
% Monitoring
if return_B_list
    B_list = []
end
t_list = [];
gradient_list = [];
loss_list = [];
if verbose
    print('Running quasi-Newton for joint diagonalization');
    print('iter | obj | gradient');
end
for t=1:max_iter
    if return_B_list
        B_list(k) = B;
    end
    diagonals = zeros(n_samples, n_features);
    for k=1:n_samples
        diagonals(k, :) = diag(squeeze(D(k, :, :)));
    end
    % Gradient
    if isempty(weights)
        G = squeeze(mean(bsxfun(@rdivide, D, ...
                                reshape(diagonals, n_samples, n_features, 1)), ...
                         1)) - eye(n_features);
    else
        G = squeeze(mean(...
                bsxfun(@times, ...
                        reshape(weights, n_samples, 1, 1), ...
                        bsxfun(@rdivide, D, ...
                               reshape(diagonals, n_samples, n_features, 1))), ...
                    1)) - eye(n_features);
    end
    g_norm = norm(G);
    if g_norm < tol
        break
    end
    % Hessian coefficients
    if isempty(weights)
        h = mean(bsxfun(@rdivide, ...
                        reshape(diagonals, n_samples, 1, n_features), ...
                        reshape(diagonals, n_samples, n_features, 1)), 1);
    else
        h = mean(bsxfun(@times, ...
                        reshape(weights, n_samples, 1, 1), ...
                        bsxfun(@rdivide, ...
                                reshape(diagonals, n_samples, 1, n_features), ...
                                reshape(diagonals, n_samples, n_features, 1))), ...
                 1);
    end
    h = squeeze(h);
    % Quasi-Newton's direction
    dt = h .* h' - 1.;
    dt(dt < lambda_min) = lambda_min;  % Regularize
    direction = -(G .* h' - G') ./ dt;
    % Line search
    [success, new_D, new_B, new_loss, direction] = ...
        linesearch(D, B, direction, current_loss, max_ls_tries, weights);
    D = new_D;
    B = new_B;
    current_loss = new_loss;
    % Monitoring
    gradient_list(t) = g_norm;
    loss_list(t) = current_loss;
    if verbose
        print(sprintf('%d  - %.2e - %.2e', t, current_loss, g_norm))
    end
end
infos = struct();
infos.t_list = t_list;
infos.gradient_list = gradient_list;
infos.loss_list = loss_list;
if return_B_list
    infos.B_list = B_list
end
end
function [op] = transform_set(M, D, diag_only)
    [n, p, ~] = size(D);
    if ~diag_only
        op = zeros(n, p, p);
        for k=1:length(D)
            op(k, :, :) = M * squeeze(D(k, :, :)) * M';
        end
    else
        op = zeros(n, p);
        for k=1:length(D)
            op(k, :) = sum(M .* (squeeze(D(k, :, :)) * M'), 1);
        end
    end
end
function [v] = slogdet(A)
    v = log(abs(det(A)));
end
function [out] = loss(B, D, is_diag, weights)
    [n, p, ~] = size(D);
    if ~is_diag
        diagonals = zeros(n, p);
        for k=1:n
            diagonals(k, :) = diag(squeeze(D(k, :, :)));
        end
    else
        diagonals = D;
    end
    logdet = -slogdet(B);
    if ~isempty(weights)
        diagonals = bsxfun(@times, diagonals, reshape(weights, n, 1));
    end
    out = logdet + 0.5 * sum(log(diagonals(:))) / n;
end
function [success, new_D, new_B, new_loss, delta] = linesearch(D, B, direction, current_loss, n_ls_tries, weights)
    [n, p, ~] = size(D);
    step = 1.;
    if current_loss == NaN
        current_loss = loss(B, D, false);
    end
    success = false;
    for n=1:n_ls_tries
        M = eye(p) + step * direction;
        new_D = transform_set(M, D, true);
        new_B = M * B;
        new_loss = loss(new_B, new_D, true, weights);
        
        if new_loss < current_loss
            success = true;
            break
        end
        step = step / 2;
    end
    new_D = transform_set(M, D, false);
    delta = step * direction;
end

我将它与以下脚本一起使用，您可以通过下载此 post 底部的 2 个输入矩阵进行测试：

clc; clear
m=7  % dimension
n=2   % number of matrices
% Load spectro and WL+GCph+XC
FISH_GCsp = load('Fisher_GCsp_flat.txt');
FISH_XC = load('Fisher_XC_GCph_WL_flat.txt');
% Marginalizing over uncommon parameters between the two matrices
COV_GCsp_first = inv(FISH_GCsp);
COV_XC_first = inv(FISH_XC);
COV_GCsp = COV_GCsp_first(1:m,1:m);
COV_XC = COV_XC_first(1:m,1:m);
% Invert to get Fisher matrix
FISH_sp = inv(COV_GCsp);
FISH_xc = inv(COV_XC);
% Drawing a random set of commuting matrices
C=zeros(n,m,m);
B0=zeros(n,m,m);
C(1,:,:) = FISH_sp
C(2,:,:) = FISH_xc
%[D, B] = qndiag(C, 'max_iter', 1e6, 'tol', 1e-6);
[D, B] = qndiag(C);
% Print diagonal matrices
B*C(1,:,:)*B'
B*C(2,:,:)*B'

但不幸的是，我收到以下错误：

Unable to perform assignment because the size of the left side is 1-by-7-by-7 and the size of the
right side is 6-by-6.
Error in qndiag>transform_set (line 224)
            op(k, :, :) = M * squeeze(D(k, :, :)) * M';
Error in qndiag (line 128)
D = transform_set(B, C, false);
Error in compute_joint_diagonalization (line 32)
[D, B] = qndiag(C);

我不明白函数的用途 squeeze 最重要的：为什么函数 eigs returns 只有 6 个值而不是 7 就像在我的数据中一样（ 输入矩阵有 7x7 大小 ）。

这个数组维度问题可能有什么问题，我该如何解决？

我将可用的 2 个输入文件放在这里：

Matrix Fisher_GCsp_flat.txt

Matrix Fisher_XC_GCph_WL_flat.txt

您可以测试上面为这 2 个矩阵调用 qndiag 的代码。

更新 1

为了让有兴趣的人快速测试代码，我放了一个link的存档：

Archive_Matlab_StackOver.tar

你只需要解压并在Matlab下执行脚本compute_joint_diagonalization.m，你就会正常看到上面的错误（关于eigs和squeeze函数的使用）。

它应该可以帮助您了解此问题的根源。

更新 2

如果我用 [p, d] = eigs(C_mean,7) 替换 [p, d] = eigs(C_mean) ，我会得到另一个错误：

Index in position 1 exceeds array bounds (must not exceed 2).

Error in qndiag>transform_set (line 224)
            op(k, :, :) = M * squeeze(D(k, :, :)) * M';

Error in qndiag (line 128)
D = transform_set(B, C, false);

Error in compute_joint_diagonalization (line 27)
[D, B] = qndiag(C);

但是，使用的 2 个矩阵的维度是 7x7，应该用 eigs(C_mean,7) 正确处理。

更新 3

op, D, M和k的大小等于（包括错误信息后）：

size(D) =
     2     7     7

length(D) =
     7
   
size(M) =
     7     7

size(op) =
     2     7     7

Index in position 1 exceeds array bounds (must not exceed 2).

Error in qndiag>transform_set (line 231)
            op(k, :, :) = M * squeeze(D(k, :, :)) * M';

Error in qndiag (line 128)
D = transform_set(B, C, false);

Error in compute_joint_diagonalization (line 27)
[D, B] = qndiag(C);

请注意 k 从 1 到 length(D)=7。

这些尺寸是否会出现问题？

Answer 1

来自 documentation for eigs:

d = eigs(A) returns a vector of the six largest magnitude eigenvalues of matrix A.

如果你想要全部七个，你需要调用d = eigs(A,7)或d = eig(A)。对于小矩阵（例如 < 1000 x 1000），通常更容易使用 eig 获取所有特征值，而不是使用 eigs.

获取子集

编辑：回应您的“更新 3”

for k=1:length(D) 应替换为 for k=1:n。这需要在两行上进行更改。从您的错误消息来看，它们是第 231 和 236 行。

L = length(X) returns X 中最大数组维度的长度，在你的例子中是 7，即对于第一个维度来说太高了。

无法执行赋值，因为左侧的大小为 1×7×7，右侧的大小为 6×6

Unable to perform assignment because the size of the left side is 1-by-7-by-7 and the size of the right side is 6-by-6

matlab

eigenvalue

eigenvector

diagonal

basis

更新 1

更新 2

更新 3