移除最小点数以获得单调函数
Remove minimum number of points to obtain monotonic function
给定:n 个离散数据点 [ti,xi],它应该描述一个单调函数(ti = 时间,xi = 数据)。有些点是"outliers",或者不服从单调函数规律(x{i+1}>=x{i}表示递增,x{i+1}<=x{i}表示递减)。
我正在尝试寻找一种算法来确定为获得单调函数而必须消除的最少数据点数。我也知道它是在增加还是在减少。
我尝试使用移动中值滤波器并识别出高于或低于过滤函数的一些方差的点,但我无法识别所有点。
解决这个问题的最佳方法是什么?
我用的是MATLAB,但是解决方案肯定可以推广。
我想到了一个用途有限的递归解决方案(因为它不会产生最长的子序列),但也许
- ...它可以根据您的需要进行扩展。
- ...它可以告诉你为什么这不是一个微不足道的问题。
function [pts_to_remove, monoseq] = q48647287
sequence = randi(1000,1000,1,'int16');
[pts_to_remove, monoseq] = remove_pts(sequence, false, 0);
% Now we can try to setdiff different subsets of `monoseq` from `sequence` and rerun the
% algorithm. Among the results we'll take the longest `monoseq`, and repeat the procedure as
% many times as needed.
% Of course it needs to be (dis)proven mathematically whether this approach can result in
% the longest possible subsequence.
end
function [pts_removed, seq] = remove_pts(seq, shouldIncrease, rem_pts_so_far)
if shouldIncrease
d = diff([-Inf; seq]) >= 0;
else
d = diff([ Inf; seq]) <= 0;
end
to_rem = sum(~d);
if to_rem % > 0
pts_removed = remove_pts(seq(d), shouldIncrease, rem_pts_so_far + to_rem);
else
pts_removed = rem_pts_so_far;
end
end
这是一个使用 Patience sort 从给定数组中找到最长递增子序列的解决方案。该解决方案不一定是唯一的,但保证其长度大于或等于任何其他递增子序列。如果您只想知道最长递增子序列的 length,则可以使用更简单的函数。
function subsequence = PatienceLIS(sequence)
% Use patience sort to generate a (not necessarily unique)
% longest increasing subsequence from a given sequence
subInds = []; % indices of subsequence elements
for s = 1:numel(sequence)
% put new element on first stack with top element > sequence(s)
newStack = find(sequence(s) <= [sequence(subInds) inf], 1);
subInds(newStack) = s; % put current index on top of newStack
if (newStack > 1)
% point to the index currently to the left
pred(s) = subInds(newStack - 1);
end
end
% recover the subsequence indices
% last element in subsequence is found from last subInds
pathInds = subInds(end);
while (pred(pathInds(1)) > 0)
pathInds = [pred(pathInds(1)) pathInds]; % add predecessor index to list
end
subsequence = sequence(pathInds); % recover subsequence from indices
end
样本运行:
x = [7 4 11 -1 13 12 10 8 5 14 14 12 15 6 1 3 2 9];
>> PatienceLIS(x)
ans =
4 11 12 14 15
给定:n 个离散数据点 [ti,xi],它应该描述一个单调函数(ti = 时间,xi = 数据)。有些点是"outliers",或者不服从单调函数规律(x{i+1}>=x{i}表示递增,x{i+1}<=x{i}表示递减)。
我正在尝试寻找一种算法来确定为获得单调函数而必须消除的最少数据点数。我也知道它是在增加还是在减少。
我尝试使用移动中值滤波器并识别出高于或低于过滤函数的一些方差的点,但我无法识别所有点。
解决这个问题的最佳方法是什么?
我用的是MATLAB,但是解决方案肯定可以推广。
我想到了一个用途有限的递归解决方案(因为它不会产生最长的子序列),但也许
- ...它可以根据您的需要进行扩展。
- ...它可以告诉你为什么这不是一个微不足道的问题。
function [pts_to_remove, monoseq] = q48647287
sequence = randi(1000,1000,1,'int16');
[pts_to_remove, monoseq] = remove_pts(sequence, false, 0);
% Now we can try to setdiff different subsets of `monoseq` from `sequence` and rerun the
% algorithm. Among the results we'll take the longest `monoseq`, and repeat the procedure as
% many times as needed.
% Of course it needs to be (dis)proven mathematically whether this approach can result in
% the longest possible subsequence.
end
function [pts_removed, seq] = remove_pts(seq, shouldIncrease, rem_pts_so_far)
if shouldIncrease
d = diff([-Inf; seq]) >= 0;
else
d = diff([ Inf; seq]) <= 0;
end
to_rem = sum(~d);
if to_rem % > 0
pts_removed = remove_pts(seq(d), shouldIncrease, rem_pts_so_far + to_rem);
else
pts_removed = rem_pts_so_far;
end
end
这是一个使用 Patience sort 从给定数组中找到最长递增子序列的解决方案。该解决方案不一定是唯一的,但保证其长度大于或等于任何其他递增子序列。如果您只想知道最长递增子序列的 length,则可以使用更简单的函数。
function subsequence = PatienceLIS(sequence)
% Use patience sort to generate a (not necessarily unique)
% longest increasing subsequence from a given sequence
subInds = []; % indices of subsequence elements
for s = 1:numel(sequence)
% put new element on first stack with top element > sequence(s)
newStack = find(sequence(s) <= [sequence(subInds) inf], 1);
subInds(newStack) = s; % put current index on top of newStack
if (newStack > 1)
% point to the index currently to the left
pred(s) = subInds(newStack - 1);
end
end
% recover the subsequence indices
% last element in subsequence is found from last subInds
pathInds = subInds(end);
while (pred(pathInds(1)) > 0)
pathInds = [pred(pathInds(1)) pathInds]; % add predecessor index to list
end
subsequence = sequence(pathInds); % recover subsequence from indices
end
样本运行:
x = [7 4 11 -1 13 12 10 8 5 14 14 12 15 6 1 3 2 9];
>> PatienceLIS(x)
ans =
4 11 12 14 15