如何向量化计算齐次变换matrix/tensor?
How to vectorize calculation of homogenous transformation matrix/tensor?
对于我的模拟,我需要计算许多变换矩阵,因此我想矢量化我现在正在使用的 for 循环。
有没有办法向量化现有的 for 循环,或者我可能需要另一种方法来计算向量和矩阵?
我准备了一个小例子:
n_dim = 1e5;
p1_3 = zeros(3,n_dim); % translationvector (no trans.) [3x100000]
tx = ones(1,n_dim)*15./180*pi; % turn angle around x-axis (fixed) [1x100000]
ty = zeros(1,n_dim); % turn angle around y-axis (no turn) [1x100000]
tz = randi([-180 180], 1, n_dim)./180*pi; % turn angle around z-axis (different turn) [1x100000]
hom = [0 0 0 1].*ones(n_dim,4); % vector needed for homogenous transformation [100000x4]
% calculate sin/cosin values for rotation [100000x1 each]
cx = cos(tx)';
sx = sin(tx)';
cy = cos(ty)';
sy = sin(ty)';
cz = cos(tz)';
sz = sin(tz)';
% calculate rotation matrix [300000x3]
R_full = [ cy.*cz, -cy.*sz, sy; ...
cx.*sz+sx.*sy.*cz, cx.*cz-sx.*sy.*sz, -sx.*cy; ...
sx.*sz-cx.*sy.*cz, cz.*sx+cx.*sy.*sz, cx.*cy];
% prealocate transformation tensor
T = zeros(4,4,n_dim);
% create transformation tensor here
% T = [R11 R12 R13 p1;
% R21 R22 R23 p2;
% R31 R32 R33 p3;
% 0 0 0 1]
tic
for i = 1:n_dim
T(:,:,i) = [[R_full(i,1), R_full(i,2), R_full(i,3); ...
R_full(n_dim+i,1), R_full(n_dim+i,2), R_full(n_dim+i,3); ...
R_full(2*n_dim+i,1), R_full(2*n_dim+i,2), R_full(2*n_dim+i,3)], p1_3(:,i);
hom(i,:)];
end
toc
编辑
我收录了,他当然赢了
准备好疯狂索引 foo 了吗?我们开始吧:
clear all;
close all;
clc;
n_dim_max = 200;
t_loop = zeros(n_dim_max, 1);
t_indexing = t_loop;
t_permute = t_loop;
fprintf("---------------------------------------------------------------\n");
for n_dim = 1:n_dim_max
p1_3 = zeros(3,n_dim); % translationvector (no trans.) [3x100000]
tx = ones(1,n_dim)*15./180*pi; % turn angle around x-axis (fixed) [1x100000]
ty = zeros(1,n_dim); % turn angle around y-axis (no turn) [1x100000]
tz = randi([-180 180], 1, n_dim)./180*pi; % turn angle around z-axis (different turn) [1x100000]
hom = [0 0 0 1].*ones(n_dim,4); % vector needed for homogenous transformation [100000x4]
% calculate sin/cosin values for rotation [100000x1 each]
cx = cos(tx)';
sx = sin(tx)';
cy = cos(ty)';
sy = sin(ty)';
cz = cos(tz)';
sz = sin(tz)';
% calculate rotation matrix [300000x3]
R_full = [ cy.*cz, -cy.*sz, sy; ...
cx.*sz+sx.*sy.*cz, cx.*cz-sx.*sy.*sz, -sx.*cy; ...
sx.*sz-cx.*sy.*cz, cz.*sx+cx.*sy.*sz, cx.*cy];
% prealocate transformation tensor
T = zeros(4,4,n_dim);
% create transformation tensor here
% T = [R11 R12 R13 p1;
% R21 R22 R23 p2;
% R31 R32 R33 p3;
% 0 0 0 1]
tic
for i = 1:n_dim
T(:,:,i) = [[R_full(i,1), R_full(i,2), R_full(i,3); ...
R_full(n_dim+i,1), R_full(n_dim+i,2), R_full(n_dim+i,3); ...
R_full(2*n_dim+i,1), R_full(2*n_dim+i,2), R_full(2*n_dim+i,3)], p1_3(:,i);
hom(i,:)];
end
t_loop(n_dim) = toc;
tic
% prealocate transformation tensor
TT = zeros(4, 4);
TT(end) = 1;
TT = repmat(TT, 1, 1, n_dim);
% Crazy index finding.
temp = repmat(1:(3*n_dim):(3*3*n_dim), 3, 1) + n_dim .* ((0:2).' * ones(1, 3));
temp = repmat(temp, 1, 1, n_dim);
t = zeros(1, 1, n_dim);
t(:) = 0:(n_dim-1);
temp = temp + ones(3, 3, n_dim) .* t;
% Direct assignment using crazily found indices.
TT(1:3, 1:3, :) = R_full(temp);
t_indexing(n_dim) = toc;
tic
% prealocate transformation tensor
TTT = zeros(4, 4);
TTT(end) = 1;
TTT = repmat(TTT, 1, 1, n_dim);
TTT(1:3, 1:3, :) = permute(reshape(R_full, n_dim, 3, 3), [2, 3, 1]);
t_permute(n_dim) = toc;
% Check
fprintf("n_dim: %d\n", n_dim);
fprintf("T equals TT: %d\n", (sum(T(:) == TT(:))) == (4 * 4 * n_dim));
fprintf("T equals TTT: %d\n", (sum(T(:) == TTT(:))) == (4 * 4 * n_dim));
fprintf("---------------------------------------------------------------\n");
end
figure(1);
plot(1:n_dim_max, t_loop, 1:n_dim_max, t_indexing, 1:n_dim_max, t_permute);
legend({'Loop', 'Indexing', 'Permute'});
xlabel('Dimension');
ylabel('Elapsed time [s]');
抱歉,脚本太长了,因为它是您的初始解决方案、我的解决方案(以及 )和测试脚本的一体。懒惰的星期五是我不能正确拆分事情的原因...
我是怎么到那里的?简单"reverse engineering"。我采用了 n_dim = [2, 3, 4]
的解决方案并确定了 [~, ii] = ismember(T(1:3, 1:3, :), R_full)
,即 R_full
到 T(1:3, 1:3, :)
的映射。然后,我分析了索引方案,并找到了适当的解决方案来模拟任意 n_dim
的映射。完毕! ;-) 是的,我喜欢疯狂的索引东西。
试试这个:
T = permute(reshape(R_full,n_dim,3,3),[2,3,1]);
T(4,4,:) = 1;
你的方法:
Elapsed time is 0.839315 seconds.
这个方法:
Elapsed time is 0.015389 seconds.
对于我的模拟,我需要计算许多变换矩阵,因此我想矢量化我现在正在使用的 for 循环。
有没有办法向量化现有的 for 循环,或者我可能需要另一种方法来计算向量和矩阵?
我准备了一个小例子:
n_dim = 1e5;
p1_3 = zeros(3,n_dim); % translationvector (no trans.) [3x100000]
tx = ones(1,n_dim)*15./180*pi; % turn angle around x-axis (fixed) [1x100000]
ty = zeros(1,n_dim); % turn angle around y-axis (no turn) [1x100000]
tz = randi([-180 180], 1, n_dim)./180*pi; % turn angle around z-axis (different turn) [1x100000]
hom = [0 0 0 1].*ones(n_dim,4); % vector needed for homogenous transformation [100000x4]
% calculate sin/cosin values for rotation [100000x1 each]
cx = cos(tx)';
sx = sin(tx)';
cy = cos(ty)';
sy = sin(ty)';
cz = cos(tz)';
sz = sin(tz)';
% calculate rotation matrix [300000x3]
R_full = [ cy.*cz, -cy.*sz, sy; ...
cx.*sz+sx.*sy.*cz, cx.*cz-sx.*sy.*sz, -sx.*cy; ...
sx.*sz-cx.*sy.*cz, cz.*sx+cx.*sy.*sz, cx.*cy];
% prealocate transformation tensor
T = zeros(4,4,n_dim);
% create transformation tensor here
% T = [R11 R12 R13 p1;
% R21 R22 R23 p2;
% R31 R32 R33 p3;
% 0 0 0 1]
tic
for i = 1:n_dim
T(:,:,i) = [[R_full(i,1), R_full(i,2), R_full(i,3); ...
R_full(n_dim+i,1), R_full(n_dim+i,2), R_full(n_dim+i,3); ...
R_full(2*n_dim+i,1), R_full(2*n_dim+i,2), R_full(2*n_dim+i,3)], p1_3(:,i);
hom(i,:)];
end
toc
编辑
我收录了
准备好疯狂索引 foo 了吗?我们开始吧:
clear all;
close all;
clc;
n_dim_max = 200;
t_loop = zeros(n_dim_max, 1);
t_indexing = t_loop;
t_permute = t_loop;
fprintf("---------------------------------------------------------------\n");
for n_dim = 1:n_dim_max
p1_3 = zeros(3,n_dim); % translationvector (no trans.) [3x100000]
tx = ones(1,n_dim)*15./180*pi; % turn angle around x-axis (fixed) [1x100000]
ty = zeros(1,n_dim); % turn angle around y-axis (no turn) [1x100000]
tz = randi([-180 180], 1, n_dim)./180*pi; % turn angle around z-axis (different turn) [1x100000]
hom = [0 0 0 1].*ones(n_dim,4); % vector needed for homogenous transformation [100000x4]
% calculate sin/cosin values for rotation [100000x1 each]
cx = cos(tx)';
sx = sin(tx)';
cy = cos(ty)';
sy = sin(ty)';
cz = cos(tz)';
sz = sin(tz)';
% calculate rotation matrix [300000x3]
R_full = [ cy.*cz, -cy.*sz, sy; ...
cx.*sz+sx.*sy.*cz, cx.*cz-sx.*sy.*sz, -sx.*cy; ...
sx.*sz-cx.*sy.*cz, cz.*sx+cx.*sy.*sz, cx.*cy];
% prealocate transformation tensor
T = zeros(4,4,n_dim);
% create transformation tensor here
% T = [R11 R12 R13 p1;
% R21 R22 R23 p2;
% R31 R32 R33 p3;
% 0 0 0 1]
tic
for i = 1:n_dim
T(:,:,i) = [[R_full(i,1), R_full(i,2), R_full(i,3); ...
R_full(n_dim+i,1), R_full(n_dim+i,2), R_full(n_dim+i,3); ...
R_full(2*n_dim+i,1), R_full(2*n_dim+i,2), R_full(2*n_dim+i,3)], p1_3(:,i);
hom(i,:)];
end
t_loop(n_dim) = toc;
tic
% prealocate transformation tensor
TT = zeros(4, 4);
TT(end) = 1;
TT = repmat(TT, 1, 1, n_dim);
% Crazy index finding.
temp = repmat(1:(3*n_dim):(3*3*n_dim), 3, 1) + n_dim .* ((0:2).' * ones(1, 3));
temp = repmat(temp, 1, 1, n_dim);
t = zeros(1, 1, n_dim);
t(:) = 0:(n_dim-1);
temp = temp + ones(3, 3, n_dim) .* t;
% Direct assignment using crazily found indices.
TT(1:3, 1:3, :) = R_full(temp);
t_indexing(n_dim) = toc;
tic
% prealocate transformation tensor
TTT = zeros(4, 4);
TTT(end) = 1;
TTT = repmat(TTT, 1, 1, n_dim);
TTT(1:3, 1:3, :) = permute(reshape(R_full, n_dim, 3, 3), [2, 3, 1]);
t_permute(n_dim) = toc;
% Check
fprintf("n_dim: %d\n", n_dim);
fprintf("T equals TT: %d\n", (sum(T(:) == TT(:))) == (4 * 4 * n_dim));
fprintf("T equals TTT: %d\n", (sum(T(:) == TTT(:))) == (4 * 4 * n_dim));
fprintf("---------------------------------------------------------------\n");
end
figure(1);
plot(1:n_dim_max, t_loop, 1:n_dim_max, t_indexing, 1:n_dim_max, t_permute);
legend({'Loop', 'Indexing', 'Permute'});
xlabel('Dimension');
ylabel('Elapsed time [s]');
抱歉,脚本太长了,因为它是您的初始解决方案、我的解决方案(以及
我是怎么到那里的?简单"reverse engineering"。我采用了 n_dim = [2, 3, 4]
的解决方案并确定了 [~, ii] = ismember(T(1:3, 1:3, :), R_full)
,即 R_full
到 T(1:3, 1:3, :)
的映射。然后,我分析了索引方案,并找到了适当的解决方案来模拟任意 n_dim
的映射。完毕! ;-) 是的,我喜欢疯狂的索引东西。
试试这个:
T = permute(reshape(R_full,n_dim,3,3),[2,3,1]);
T(4,4,:) = 1;
你的方法:
Elapsed time is 0.839315 seconds.
这个方法:
Elapsed time is 0.015389 seconds.