Aligning data arrays by closest time

Question

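I have 2 vectors of data and corresponding time vectors. The data is sampled nearly simultaneously, but the time stamps differ slightly (from machine precision, transmission delays, etc.). Because of telemetry issues, one or both of the data vectors occasionally drop samples and occasionally contain double samples.

I want to match up the data arrays where their times match, so that I can perform some math operations between them. Essentially, I want to remove the points from y1 & y2 that have no corresponding time in x1 & x2 (times within about 1/2 of the sampling interval count as a match).

Note: I do not want to interpolate y1 & y2.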

%Sample time stamps: Real ones are much faster and not as neat.
x1 = [1  2 3 4 5 5.1 6   7   8       10  ]; %note double sample at ~5.
x2 = [.9       4.9    5.9 6.9 8.1 9.1 10.1]; %Slightly different times.

%Sample data:  y2 is basically y1+1 if no data was missing
y1 = [1 2 3 4 5 5 6 7 8    10];
y2 = [2       6   7 8 9 10 11];

So the result should look like this:

y1_m = [1 5 6 7 8 10];
y2_m = [2 6 7 8 9 11];

What I have so far: I use interp1 to find, for each time in x1, the nearest time in x2. Then I get the time difference between the two like this:

>> idx = interp1(x2,1:numel(x2),x1,'nearest','extrap')
idx =
     1     1     2     2     2     2     3     4     5     7

>> xDelta = abs(x2(idx) - x1)
xDelta =
    0.1000    1.1000    1.9000    0.9000    0.1000    0.2000    0.1000    0.1000    0.1000    0.1000

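Now I think what I need to do is find the minimum xDelta for each unique idx, which should give me all the matched points. However, I haven't come up with a clever way to do that. It seems like accumarray should be useful here, but so far I haven't managed to make it work.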

Answer 1

Here is a rough idea, which you can improve upon, that uses unique and ismembertol:

function [y1_m, y2_m] = q48723002
%% Stage 0 - Setup:
%Sample time stamps: Real ones are much faster and not as neat.
x1 = [1  2 3 4 5   5.1 6   7   8       10  ]; %note double sample at ~5.
x2 = [.9       4.9     5.9 6.9 8.1 9.1 10.1]; %Slightly different times.

%Sample data:  y2 is basically y1+1 if no data was missing
y1 = [1 2 3 4 5 5 6 7 8    10];
y2 = [2       6   7 8 9 10 11];
%% Stage 1 - Remove repeating samples:
SR = 0.5; % Sampling rate, for rounding.
[~,Loc1] = ismembertol(x1,round(x1/SR)*SR,SR/2,'DataScale',1);
[~,Loc2] = ismembertol(x2,round(x2/SR)*SR,SR/2,'DataScale',1);
u1 = unique(Loc1);
u2 = unique(Loc2);
x1u = x1(u1);
y1u = y1(u1);
x2u = x2(u2);
y2u = y2(u2);
clear Loc1 Loc2
%% Stage 2 - Get a vector of reference time steps:
% Build the reference from the rounded time stamps, not from the index vectors u1/u2.
ut = union(round(x1u/SR)*SR, round(x2u/SR)*SR);
%% Stage 3 - Only keep times found in both
[In1,Loc1] = ismembertol(ut,x1u,SR/2,'DataScale',1);
[In2,Loc2] = ismembertol(ut,x2u,SR/2,'DataScale',1);
valid = In1 & In2;
%% Stage 4 - Output:
% Loc1/Loc2 index into the de-duplicated vectors, so read the data from y1u/y2u.
y1_m = y1u(Loc1(valid));
y2_m = y2u(Loc2(valid));

ans =

     1     5     6     7     8    10

See also: uniquetol.
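As a minimal sketch of how uniquetol could stand in for the Stage 1 de-duplication, assuming an absolute tolerance of 0.25 (the name tol and its value are illustrative, not from the answer above): uniquetol groups time stamps that lie within the tolerance of each other and returns one representative per group, and its second output can be used to pick the matching data samples. Which member of a group is kept is up to uniquetol, so this only fits cases where either of the double samples is acceptable.

```matlab
% Time stamps and data for the first channel (from the question).
x1 = [1 2 3 4 5 5.1 6 7 8 10];
y1 = [1 2 3 4 5 5   6 7 8 10];

tol = 0.25; % assumed absolute tolerance, a quarter of the nominal sample interval

% 'DataScale',1 makes tol an absolute tolerance instead of a relative one.
% ia indexes into x1 such that x1u = x1(ia).
[x1u, ia] = uniquetol(x1, tol, 'DataScale', 1);

% Keep the data samples that belong to the surviving time stamps.
y1u = y1(ia);
```

The same call applied to x2 and y2 gives the second de-duplicated pair.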

Answer 2

Here is a solution based on @Cris Luengo's comment on the original question.

It uses sortrows & unique to get the lowest time error for each pair of data points.

%Sample time stamps: Real ones are much faster and not as neat.
x1 = [1  2 3 4 5 5.1 6   7   8       10  ]; %note double sample at ~5.
x2 = [.9       4.9    5.9 6.9 8.1 9.1 10.1]; %Slightly different times.

%Sample data:  y2 is basically y1+1 if no data was missing
y1 = [1 2 3 4 5 5 6 7 8    10];
y2 = [2       6   7 8 9 10 11];

%Find the nearest match
idx   = interp1(x2,1:numel(x2),x1,'nearest','extrap');
xDiff = abs(x2(idx) - x1);

% Combine the matched indices & the deltas together & sort by rows.
%So lowest delta for a given index is first.
[A, idx1]    = sortrows([idx(:) xDiff(:)]);
[idx2, uidx] = unique(A(:,1),'first');
idx1         = idx1(uidx); %resort idx1

%output
y1_m = y1(idx1)
y2_m = y2(idx2)


y1_m =
     1     5     6     7     8    10
y2_m =
     2     6     7     8     9    11
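Since the question mentions accumarray, here is a hedged sketch of the same "minimum delta per unique idx" idea written with it. It also adds an explicit tolerance check, which the sortrows version above does not do. The variable tol and its value of 0.5 (half the nominal sample interval of 1) are assumptions for illustration.

```matlab
x1 = [1  2 3 4 5 5.1 6   7   8       10  ];
x2 = [.9       4.9    5.9 6.9 8.1 9.1 10.1];
y1 = [1 2 3 4 5 5 6 7 8    10];
y2 = [2       6   7 8 9 10 11];

tol = 0.5; % assumed matching tolerance: half the nominal sample interval

% Nearest x2 sample for every x1 sample, and the time error of that pairing.
idx   = interp1(x2, 1:numel(x2), x1, 'nearest', 'extrap');
xDiff = abs(x2(idx) - x1);

% Smallest time error per x2 index, as a row vector.
% Inf marks x2 samples that no x1 sample maps to.
minDiff = accumarray(idx(:), xDiff(:), [numel(x2) 1], @min, Inf).';

% Keep an x1 sample only if it is the best candidate for its x2 partner
% and the pairing is within the tolerance.
keep = (xDiff == minDiff(idx)) & (xDiff <= tol);

y1_m = y1(keep)       % 1 5 6 7 8 10
y2_m = y2(idx(keep))  % 2 6 7 8 9 11
```

One caveat: if two x1 samples are exactly equally close to the same x2 sample, both pass the equality test, so a tie-break would be needed for data where that can happen.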
