Vectorizing matrix multiplication inside a tensor
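I'm having trouble vectorizing part of my code. I have an (n,n,m) tensor, and I want to multiply each of the m slices by a second (n x n) matrix (true matrix multiplication, not element-wise).
Here is what the for loop looks like: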
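Tensor = zeros(2,2,3);   % placeholder; fill with real data (all-zero slices give 0/0 = NaN)
Matrix = [1,2; 3,4];
Recursive_Matrix = zeros(size(Matrix));   % accumulator must exist before the loop
for j = 1:size(Tensor,3)                  % iterate over the slices (third dimension)
    Matrices_Multiplied = Tensor(:,:,j)*Matrix;
    Recursive_Matrix = Recursive_Matrix + Tensor(:,:,j)/trace(Matrices_Multiplied);
end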
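How can I perform the matrix multiplication on each matrix inside the tensor in a vectorized way? Is there a built-in function such as tensor-dot that can handle this, or is there a smarter approach?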
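Using bsxfun together with efficient matrix multiplication, we can avoid the loop. Only the trace of each product is needed, and trace(A*B) equals the sum of the elements of A.*B.', so all traces can be obtained from a single matrix multiplication on the reshaped arrays -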
% Calculate trace values using matrix-multiplication
T = reshape(Matrix.',1,[])*reshape(Tensor,[],size(Tensor,3));
% Use broadcasting to perform elementwise division across all slices
out = sum(bsxfun(@rdivide,Tensor,reshape(T,1,1,[])),3);
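As a quick sanity check of the trace step above, here is a minimal sketch on small random inputs. The sizes and the names `T_loop`, `T_vec` and `max_err` are assumptions made for illustration and do not appear in the original post. It relies on the identity trace(A*B) = sum(sum(A.*B.')), so each trace is a dot product between a flattened slice and the flattened transpose of `Matrix`.

```matlab
% Illustrative sizes (assumed for this sketch)
n = 4; m = 5;
Tensor = rand(n,n,m);
Matrix = rand(n,n);

% Reference: one trace per slice, computed with a loop
T_loop = zeros(1,m);
for j = 1:m
    T_loop(j) = trace(Tensor(:,:,j)*Matrix);
end

% Vectorized: each column of the reshaped tensor is one flattened slice,
% so a single row-vector-by-matrix product yields all m traces
T_vec = reshape(Matrix.',1,[])*reshape(Tensor,[],m);

% Largest absolute difference; expected to be at round-off level
max_err = max(abs(T_loop - T_vec));
disp(max_err)
```

The transpose on `Matrix` matters: without it the product would compute trace(A*B.') instead.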
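Similarly, the last step can be replaced with one more matrix multiplication for a further performance boost. The all-matrix-multiplication solution would then be -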
[m,n,r] = size(Tensor);
out = reshape(reshape(Tensor,[],size(Tensor,3))*(1./T.'),m,n)
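To confirm that the all-matrix-multiplication form gives the same result as the original loop, a small comparison can be sketched as follows. The sizes and the names `ref` and `max_err` are illustrative assumptions. Since `rand` returns positive values, every trace here is positive and the division is well defined.

```matlab
% Illustrative sizes (assumed for this sketch)
n = 3; m = 4;
Tensor = rand(n,n,m);
Matrix = rand(n,n);

% Loop reference: sum of slices, each divided by trace(slice*Matrix)
ref = zeros(n,n);
for j = 1:m
    ref = ref + Tensor(:,:,j)/trace(Tensor(:,:,j)*Matrix);
end

% All-matrix-multiplication version
T = reshape(Matrix.',1,[])*reshape(Tensor,[],m);      % 1 x m vector of traces
out = reshape(reshape(Tensor,[],m)*(1./T.'), n, n);   % weighted sum of slices

% Largest absolute difference; expected to be at round-off level
max_err = max(abs(ref(:) - out(:)));
disp(max_err)
```

The second product works because summing the slices with weights 1/T(j) is a linear combination of the columns of the reshaped tensor.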
Runtime test
Benchmark code -
% Input arrays
n = 100; m = 100;
Tensor=rand(n,n,m);
Matrix=rand(n,n);
num_iter = 100; % Number of iterations to be run for
tic
disp('------------ Loopy woopy doops : ')
for iter = 1:num_iter
Recursive_Matrix = zeros(n,n);
for j=1:m   % loop over the m slices (third dimension)
Matrices_Multiplied = Tensor(:,:,j)*Matrix;
Recursive_Matrix=Recursive_Matrix+Tensor(:,:,j)/trace(Matrices_Multiplied);
end
end
toc, clear iter Recursive_Matrix Matrices_Multiplied
tic
disp('------------- Bsxfun matrix-mul not so dull : ')
for iter = 1:num_iter
T = reshape(Matrix.',1,[])*reshape(Tensor,[],size(Tensor,3));
out = sum(bsxfun(@rdivide,Tensor,reshape(T,1,1,[])),3);
end
toc, clear T out
tic
disp('-------------- All matrix-mul having a ball : ')
for iter = 1:num_iter
T = reshape(Matrix.',1,[])*reshape(Tensor,[],size(Tensor,3));
[m,n,r] = size(Tensor);
out = reshape(reshape(Tensor,[],size(Tensor,3))*(1./T.'),m,n);
end
toc
Timings -
------------ Loopy woopy doops :
Elapsed time is 3.339464 seconds.
------------- Bsxfun matrix-mul not so dull :
Elapsed time is 1.354137 seconds.
-------------- All matrix-mul having a ball :
Elapsed time is 0.373712 seconds.
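As an alternative to the tic/toc loops, the two vectorized variants could also be timed with MATLAB's `timeit`, which handles warm-up and repetition itself. This is only a sketch: the handle names `f_bsx` and `f_mul` are assumptions, and the timings it prints will depend on the machine and MATLAB version.

```matlab
% Same input sizes as in the benchmark above
n = 100; m = 100;
Tensor = rand(n,n,m);
Matrix = rand(n,n);

% bsxfun variant wrapped in an anonymous function (no inputs, as timeit expects)
f_bsx = @() sum(bsxfun(@rdivide, Tensor, ...
    reshape(reshape(Matrix.',1,[])*reshape(Tensor,[],m), 1, 1, [])), 3);

% All-matrix-multiplication variant
f_mul = @() reshape(reshape(Tensor,[],m) * ...
    (1 ./ (reshape(Matrix.',1,[])*reshape(Tensor,[],m)).'), n, n);

t_bsx = timeit(f_bsx);   % typical time per call, in seconds
t_mul = timeit(f_mul);
fprintf('bsxfun: %.6f s, matrix-mul: %.6f s\n', t_bsx, t_mul);
```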