Aligning data arrays by closest time
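I have 2 data vectors with corresponding time vectors. The data is sampled nearly simultaneously, but the timestamps differ slightly (machine precision, transmission delays, etc.). Due to telemetry issues, one or both data vectors occasionally lose samples and occasionally contain double samples.
I want to match up the data arrays where their times match so that I can perform some math operations between them. Essentially, I want to remove the points in y1 & y2 that have no corresponding time in x1 & x2 (times within about 1/2 of the sample period count as a match).
Note that I do not want to interpolate y1 & y2.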
%Sample time stamps: Real ones are much faster and not as neat.
x1 = [1 2 3 4 5 5.1 6 7 8 10 ]; %note double sample at ~5.
x2 = [.9 4.9 5.9 6.9 8.1 9.1 10.1]; %Slightly different times.
%Sample data: y is basically y1+1 if no data was missing
y1 = [1 2 3 4 5 5 6 7 8 10];
y2 = [2 6 7 8 9 10 11];
So the result should look like this:
y1_m = [1 5 6 7 8 10];
y2_m = [2 6 7 8 9 11];
What I have so far: I use interp1 to find the nearest time points between the 2 time arrays, and then get the time difference between them like this:
>> idx = interp1(x2,1:numel(x2),x1,'nearest','extrap')
idx =
1 1 2 2 2 2 3 4 5 7
>> xDelta = abs(x2(idx) - x1)
xDelta =
0.1000 1.1000 1.9000 0.9000 0.1000 0.2000 0.1000 0.1000 0.1000 0.1000
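Now I think what I need to do is find the minimum xDelta for each unique idx, which should give me all the matching points. However, I haven't come up with a clever way to do that... It seems like accumarray should be useful here, but so far I have not managed to use it.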
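For illustration, the per-index minimum described above could be computed with accumarray roughly as follows. This is only a sketch: the tolerance tol = 0.5 (half of a nominal sample period of 1) and the names minDelta and keep are assumptions, not part of the question. If two x1 samples are exactly equidistant from the same x2 sample, both are kept.

```matlab
% Sample data from the question.
x1 = [1 2 3 4 5 5.1 6 7 8 10];
x2 = [.9 4.9 5.9 6.9 8.1 9.1 10.1];
y1 = [1 2 3 4 5 5 6 7 8 10];
y2 = [2 6 7 8 9 10 11];
tol = 0.5; % assumed tolerance: half of a nominal sample period of 1

% Nearest x2 index for every x1 sample, and the time difference to it.
idx = interp1(x2, 1:numel(x2), x1, 'nearest', 'extrap');
xDelta = abs(x2(idx) - x1);

% Smallest delta per x2 index; Inf marks x2 samples that no x1 sample chose.
minDelta = accumarray(idx(:), xDelta(:), [numel(x2) 1], @min, Inf);
minDelta = minDelta.'; % row vector, so minDelta(idx) lines up with xDelta

% Keep x1 samples that are the closest candidate for their x2 sample
% and that lie within the tolerance.
keep = (xDelta == minDelta(idx)) & (xDelta <= tol);
y1_m = y1(keep);
y2_m = y2(idx(keep));
```

The exact comparison xDelta == minDelta(idx) is safe here because min returns one of its inputs unchanged rather than a recomputed value.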
Here's a rough idea that you could improve upon, which uses unique and ismembertol:
function [y1_m, y2_m] = q48723002
%% Stage 0 - Setup:
%Sample time stamps: Real ones are much faster and not as neat.
x1 = [1 2 3 4 5 5.1 6 7 8 10 ]; %note double sample at ~5.
x2 = [.9 4.9 5.9 6.9 8.1 9.1 10.1]; %Slightly different times.
%Sample data: y is basically y1+1 if no data was missing
y1 = [1 2 3 4 5 5 6 7 8 10];
y2 = [2 6 7 8 9 10 11];
%% Stage 1 - Remove repeating samples:
SR = 0.5; % Sampling rate, for rounding.
[~,Loc1] = ismembertol(x1,round(x1/SR)*SR,SR/2,'DataScale',1);
[~,Loc2] = ismembertol(x2,round(x2/SR)*SR,SR/2,'DataScale',1);
u1 = unique(Loc1);
u2 = unique(Loc2);
x1u = x1(u1);
y1u = y1(u1);
x2u = x2(u2);
y2u = y2(u2);
clear Loc1 Loc2
%% Stage 2 - Get a vector of reference time steps:
ut = union(round(x1u/SR)*SR,round(x2u/SR)*SR); % Rounded times, not indices.
%% Stage 3 - Only keep times found in both
[In1,Loc1] = ismembertol(ut,x1u,SR/2,'DataScale',1);
[In2,Loc2] = ismembertol(ut,x2u,SR/2,'DataScale',1);
valid = In1 & In2;
%% Stage 4 - Output:
y1_m = y1u(Loc1(valid)); % Data at the matched times, not the times themselves.
y2_m = y2u(Loc2(valid));
>> [y1_m, y2_m] = q48723002
y1_m =
1 5 6 7 8 10
y2_m =
2 6 7 8 9 11
See also: uniquetol.
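As a hedged sketch of how uniquetol might replace the duplicate-removal stage, the snippet below collapses the double sample near t = 5 in x1. The tolerance of 0.25 is an assumption chosen to match SR/2 in the code above, and the names x1u, y1u and ia are only illustrative.

```matlab
x1 = [1 2 3 4 5 5.1 6 7 8 10]; % note double sample at ~5
y1 = [1 2 3 4 5 5 6 7 8 10];
tol = 0.25; % assumed absolute tolerance, same value as SR/2 above

% With 'DataScale' set to 1, tol is treated as an absolute tolerance.
% uniquetol returns one representative per group of near-equal times,
% together with the indices of those representatives in x1.
[x1u, ia] = uniquetol(x1, tol, 'DataScale', 1);
y1u = y1(ia); % data values that belong to the retained time stamps
```

Which of the two near-duplicate time stamps is retained does not matter for this data set, because both carry the same y1 value.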
Here's a solution based on @Cris Luengo's comment on the original question.
It uses sortrows & unique to get the lowest time error for each pair of data points.
%Sample time stamps: Real ones are much faster and not as neat.
x1 = [1 2 3 4 5 5.1 6 7 8 10 ]; %note double sample at ~5.
x2 = [.9 4.9 5.9 6.9 8.1 9.1 10.1]; %Slightly different times.
%Sample data: y is basically y1+1 if no data was missing
y1 = [1 2 3 4 5 5 6 7 8 10];
y2 = [2 6 7 8 9 10 11];
%Find the nearest match
idx = interp1(x2,1:numel(x2),x1,'nearest','extrap');
xDiff = abs(x2(idx) - x1);
% Combine the matched indices & the deltas together & sort by rows.
%So lowest delta for a given index is first.
[A, idx1] = sortrows([idx(:) xDiff(:)]);
[idx2, uidx] = unique(A(:,1),'first');
idx1 = idx1(uidx); %resort idx1
%output
y1_m = y1(idx1)
y2_m = y2(idx2)
y1_m =
1 5 6 7 8 10
y2_m =
2 6 7 8 9 11
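The code above pairs every selected x2 sample with its closest x1 sample, however far apart they are. If pairs should also be rejected when they lie outside the matching window mentioned in the question, a tolerance check could be added along these lines. This is a sketch on made-up data: the vectors, the tolerance tol = 0.5 and the mask ok are all assumptions for illustration.

```matlab
% Hypothetical data: x1(2) = 3 has no partner in x2 within the tolerance.
x1 = [1 3 5];
y1 = [10 30 50];
x2 = [1.1 3.9 5.1];
y2 = [11 39 51];
tol = 0.5; % assumed tolerance: half of a nominal sample period of 1

% Find the nearest match, as before.
idx = interp1(x2, 1:numel(x2), x1, 'nearest', 'extrap');
xDiff = abs(x2(idx) - x1);

% Sort so that the lowest delta for a given index comes first.
[A, idx1] = sortrows([idx(:) xDiff(:)]);
[idx2, uidx] = unique(A(:,1), 'first');
idx1 = idx1(uidx);

% Drop pairs whose best delta still exceeds the tolerance.
ok = A(uidx, 2) <= tol;
y1_m = y1(idx1(ok));
y2_m = y2(idx2(ok));
```

Here the sample at x1 = 3 is closest to x2 = 3.9, but the 0.9 gap exceeds the tolerance, so only the pairs at roughly t = 1 and t = 5 remain.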