Vectorize MATLAB for loop
I have the following lines of code:
y = zeros(n, 1);
for i=1:n
b = L * [u(i:-1:max(1,i-M+1));zeros((-i+M)*(i-M<0),1)];
y(i) = b' * gamma;
end
u is n x 1, gamma is M x 1, and L is M x M.
n takes very large values, so are there any ideas on how to vectorize the for loop?
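Edit: I am redoing the solution because I found that MATLAB does not handle anonymous functions well, so I changed the call from an anonymous function to a regular function. In each test below, the first elapsed time is the vectorized version and the second is the loop. With this change:
Test 1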
Comparison(40E3, 3E3)
Elapsed time is 21.731176 seconds.
Elapsed time is 251.327347 seconds.
|y2-y1| = 3.1519e-06
Test 2
Comparison(40E3, 1E3)
Elapsed time is 6.407259 seconds.
Elapsed time is 25.551116 seconds.
|y2-y1| = 2.8402e-07
Test 3
Comparison(20E3, 3E3)
Elapsed time is 10.484422 seconds.
Elapsed time is 125.033313 seconds.
|y2-y1| = 1.5462e-06
Test 4
Comparison(20E3, 1E3)
Elapsed time is 3.153404 seconds.
Elapsed time is 13.200649 seconds.
|y2-y1| = 1.5627e-07
The functions are:
function Comparison(n, M)
u = rand(n, 1);
L = rand(M);
gamma = rand(M, 1);
tic
y1 = vectorized(u, L, gamma);
toc
tic
y2 = looped(u, L, gamma);
toc
disp(['|y2-y1| = ', num2str(norm(y2 - y1, 1))])
end
function y = vectorized(u, l, gamma)
global a Column
M = length(gamma);
Column = l' * gamma;
x = bsxfun(@plus, -(1:M)', (1:length(u)) + 1);
x(x < 1) = 1;
a = u(x);
a(1:M, 1:M) = a(1:M, 1:M) .* triu(ones(M));
a = a';
m = 1:size(a,1);
y = arrayfun(@VectorY , m)';
end
function yi = VectorY(j)
global a Column
yi = a(j,:) * Column;
end
function y = looped(U, l, gamma)
n = length(U);
M = length(gamma);
u = U';
L = l';
y = zeros(n, 1);
for i=1:n
y(i) = [u(i:-1:max(1,i-M+1)), zeros(1,(-i+M)*(i-M<0))] * L * gamma;
end
end
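Calling VectorY through arrayfun computes one row-times-column product per output, which amounts to a single matrix-vector product. A minimal sketch of that simplification follows, assuming real-valued inputs; the name vectorized_noglobals is hypothetical and not part of the original post. It also masks out-of-range entries directly instead of using triu, so it should not require n >= M.

```matlab
function y = vectorized_noglobals(u, L, gamma)
% Sketch: same index construction as vectorized(), without globals or arrayfun.
% vectorized_noglobals is a hypothetical name used only for illustration.
    n = length(u);
    M = length(gamma);
    column = L' * gamma;                       % M-by-1 combined weights
    x = bsxfun(@plus, -(1:M)', (1:n) + 1);     % x(k,i) = i - k + 1
    valid = x >= 1;                            % true where the index falls inside u
    x(~valid) = 1;                             % placeholder index, masked out below
    a = reshape(u(x), M, n) .* valid;          % M-by-n, zero where i - k + 1 < 1
    y = a' * column;                           % n-by-1, one product for all outputs
end
```

Like the original vectorized function, this still builds an M-by-n matrix, so memory use grows with M*n.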
Discussion and solution code
Initial approach
Matrix-multiplication based approach:
u_pad = [zeros(M-1,1) ; u];             % Pad u with zeros
idx = bsxfun(@plus,(M:-1:1)',0:n-1);    % Calculate sliding indices for u
u_pad_indexed = u_pad(idx);             % Index into padded u
y_vectzed = gamma.'*L*u_pad_indexed;    % Matrix multiplications for final output
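To see why this product reproduces the loop, it may help to write out one iteration. The derivation below is a sketch assuming real-valued inputs; the symbol v_i (the zero-padded, reversed window of u used in iteration i) is introduced here for illustration. Note that the result is a 1 x n row vector, whereas the loop fills an n x 1 column.

```latex
\[
v_i = \begin{bmatrix} u_i \\ u_{i-1} \\ \vdots \\ u_{i-M+1} \end{bmatrix},
\qquad u_j = 0 \ \text{for } j < 1,
\]
\[
y_i = (L v_i)^{\top} \gamma = v_i^{\top} L^{\top} \gamma = \gamma^{\top} L \, v_i ,
\]
\[
\begin{bmatrix} y_1 & y_2 & \cdots & y_n \end{bmatrix}
= \gamma^{\top} L \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}.
\]
```

The columns v_1, ..., v_n are exactly the columns of u_pad(idx), which is what the indexing step builds.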
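Modified approach
Now, you are dealing with large data. To optimize for that case, the data can be broken into smaller, workable chunks and the operation carried out iteratively, with each iteration computing one portion of the output array.
With this strategy, the initial setup is done once and reused in every iteration. The modified approach looks like this: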
div_factor = [...] % Make sure it is a divisor of n
nrows = n/div_factor;
start_idx = bsxfun(@plus,(M:-1:1)',0:nrows-1);
u_pad = [zeros(M-1,1) ; u];
y_vectorized = zeros(div_factor,n/div_factor);
for iter = 1:div_factor
u_pad_indexed = u_pad((iter-1)*nrows + start_idx);
y_vectorized(iter,:) = gamma.'*L*u_pad_indexed;
end
y_vectorized = reshape(y_vectorized.',[],1);
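The code above requires div_factor to divide n. As a hedged variation, the same chunking idea can be written so that the last chunk is allowed to be shorter. The function name chunked_product and the parameter chunk (number of outputs per iteration) are hypothetical; the sketch also hoists gamma.'*L out of the loop, which is an optional change rather than part of the original answer.

```matlab
function y = chunked_product(u, L, gamma, chunk)
% Sketch of the chunked approach for any n (hypothetical helper).
% chunk is the number of output samples computed per iteration.
    n = length(u);
    M = length(gamma);
    w = gamma.' * L;                                  % 1-by-M, computed once
    u_pad = [zeros(M-1, 1); u(:)];                    % zero history before u(1)
    base_idx = bsxfun(@plus, (M:-1:1)', 0:chunk-1);   % M-by-chunk sliding indices
    y = zeros(n, 1);
    for s = 1:chunk:n
        cols = min(chunk, n - s + 1);                 % last chunk may be shorter
        idx = base_idx(:, 1:cols) + (s - 1);          % shift the window to this chunk
        block = reshape(u_pad(idx), M, cols);         % M-by-cols slice of padded u
        y(s:s+cols-1) = (w * block).';                % store this chunk of outputs
    end
end
```

A call such as chunked_product(u, L, gamma, 20000) would process 20000 outputs per iteration; peak memory for the index and data blocks then scales with M*chunk instead of M*n.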
Benchmarking
% Size parameters and input arrays
n = 4000000;
M = 1000;
u = rand(n,1);
gamma = rand(M,1);
L = rand(M,M);
% Warm up tic/toc.
for k = 1:50000
tic(); elapsed = toc();
end
disp('----------- With Original loopy code');
tic
y = zeros(n, 1);
for i=1:n
b = L * [u(i:-1:max(1,i-M+1));zeros((-i+M)*(i-M<0),1)];
y(i) = b' * gamma;
end
toc
clear b y
disp('----------- With Proposed solution code');
tic
% ..... Proposed modified code with div_factor = 200
toc
Runtimes
----------- With Original loopy code
Elapsed time is 498.563049 seconds.
----------- With Proposed solution code
Elapsed time is 44.273299 seconds.
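Since each output is a fixed set of weights, L.'*gamma, applied to the current and previous M-1 samples of u, the same computation can also be expressed as an FIR filter. The sketch below uses MATLAB's built-in filter for that, assuming real-valued inputs; fir_equivalent is a hypothetical name, and this variant was not part of the benchmark above, so no timing is claimed for it.

```matlab
function y = fir_equivalent(u, L, gamma)
% Sketch: express the loop as an FIR filter (hypothetical helper).
    c = L.' * gamma;            % c(k) is the weight applied to u(i-k+1)
    y = filter(c, 1, u(:));     % y(i) = sum over k of c(k)*u(i-k+1), zero initial state
end
```

This form never builds the M-by-n index matrix, so its extra memory is on the order of n + M.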