Replace repeated value based on sequence size - Matlab

I have a 2D matrix composed of ones and zeros.

mat = [0 0 0 0 1 1 1 0 0
       1 1 1 1 1 0 0 1 0
       0 0 1 0 1 1 0 0 1];

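In each row, I need to find every run of consecutive ones and replace the ones with zeros, but only when the run is shorter than 5 (5 consecutive ones). The expected result is: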

mat = [0 0 0 0 0 0 0 0 0
       1 1 1 1 1 0 0 0 0
       0 0 0 0 0 0 0 0 0];

Any suggestion on how to approach this problem would be very welcome.

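You can use diff to find the start and end points of the runs of 1, and then some logic based on those points to zero the runs which are too short. Please see the code below and its comments: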

% Input matrix of 0s and 1s
mat = [0 0 0 0 1 1 1 0 0
       1 1 1 1 1 0 0 1 0
       0 0 1 0 1 1 0 0 1];
% Minimum run length of 1s to keep   
N = 5;

% Get the start and end points of the runs of 1. Add in values from the
% original matrix to ensure that start and end points are always paired
d = [mat(:,1),diff(mat,1,2),-mat(:,end)];
% Find those start and end points. Use the transpose during the find to
% flip rows/cols and search row-wise relative to input matrix.
[cs,r] = find(d.'>0.5);  % Start points
[ce,~] = find(d.'<-0.5); % End points
c = [cs, ce];            % Column number array for start/end
idx = diff(c,1,2) < N;   % From column number, check run length vs N

% Loop over the runs which didn't satisfy the threshold and zero them
for ii = find(idx.')
    mat(r(ii),c(ii,1):c(ii,2)-1) = 0;
end
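To make the padding trick easier to follow, here is a minimal single-row sketch of the same idea. The row is the second row of the example matrix; the variable names starts, stops and runLengths are only illustrative and do not appear in the code above.

```matlab
% Second row of the example matrix, used on its own for illustration
row = [1 1 1 1 1 0 0 1 0];

% Pad the diff with the first value and the negated last value, so a run
% touching either edge of the row still gets both a start and an end marker
d = [row(1), diff(row), -row(end)];   % has numel(row)+1 elements

starts = find(d > 0.5);    % column where each run of 1s begins
stops  = find(d < -0.5);   % column just past the end of each run
runLengths = stops - starts;
```

Because every +1 in d is followed by a matching -1, starts and stops always have the same number of elements, and their difference is the run length that gets compared against N.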

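If you want to throw legibility out of the window, this can be condensed into a slightly faster and denser version, based on exactly the same logic: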

[c,r] = find([mat(:,1),diff(mat,1,2),-mat(:,end)].'); % find run start/end points
for ii = 1:2:numel(c)     % Loop over runs (start/end points alternate in c)
    if c(ii+1)-c(ii) < N  % Check if run is shorter than the threshold length
        mat(r(ii),c(ii):c(ii+1)-1) = 0; % Zero the run if so
    end
end
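The condensed loop steps through c two elements at a time, which works because find on the transposed matrix returns the markers row by row (relative to the input), so start and end points alternate. A small sketch on the example matrix shows this pairing; pairs, rowsOfRuns and runLengths are illustrative names, not part of the answer's code.

```matlab
mat = [0 0 0 0 1 1 1 0 0
       1 1 1 1 1 0 0 1 0
       0 0 1 0 1 1 0 0 1];

% Same call as in the condensed version: all nonzero markers, in order
[c,r] = find([mat(:,1),diff(mat,1,2),-mat(:,end)].');

% Regroup the alternating start/end markers, one run per row
pairs = reshape(c, 2, []).';     % [start column, column just past the end]
rowsOfRuns = r(1:2:end);         % input row that each run belongs to
runLengths = diff(pairs, 1, 2);  % length of each run
```

Each row of pairs holds one run, so comparing runLengths against N selects the runs that the loop would zero.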

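@Wolfie's vectorized solution is concise, but it is a bit hard to follow and far from the wording of the problem. Here is a direct translation of the problem using loops. It has the advantage of being easier to understand, and it is slightly faster because it allocates less memory, which makes it suitable for very large inputs.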

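[m,n] = size(mat);
for i = 1:m
    j = 1;
    while j <= n
        seqSum = 1;                 % Step size; also the run length if a run starts here
        if mat(i,j) == 1
            % Count the consecutive 1s following position j
            for k = j+1:n
                if mat(i,k) == 1
                    seqSum = seqSum + 1;
                else
                    break
                end
            end
            % Zero the run if it is shorter than 5
            if seqSum < 5
                mat(i,j:j+seqSum-1) = 0;
            end
        end
        j = j + seqSum;             % Jump past the run (or past a single 0)
    end
end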
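As a sanity check, the two approaches can be run side by side on the same input and compared. This is only a sketch: the random test matrix, its size and density, the seed, and the names matIn, A, B and sameResult are assumptions made for illustration, and the loop version uses N in place of the hard-coded 5.

```matlab
% Hypothetical test input; size, density and seed are arbitrary choices
rng(0);
matIn = double(rand(50,200) > 0.3);
N = 5;

% Approach 1: condensed diff/find version
A = matIn;
[c,r] = find([A(:,1),diff(A,1,2),-A(:,end)].');
for ii = 1:2:numel(c)
    if c(ii+1)-c(ii) < N
        A(r(ii),c(ii):c(ii+1)-1) = 0;
    end
end

% Approach 2: explicit loops
B = matIn;
[m,n] = size(B);
for i = 1:m
    j = 1;
    while j <= n
        seqSum = 1;
        if B(i,j) == 1
            for k = j+1:n
                if B(i,k) == 1
                    seqSum = seqSum + 1;
                else
                    break
                end
            end
            if seqSum < N
                B(i,j:j+seqSum-1) = 0;
            end
        end
        j = j + seqSum;
    end
end

% Both results should be identical
sameResult = isequal(A,B);
```

Wrapping each half in tic/toc or timeit would be one way to check the speed claim on your own data.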