向量化 MATLAB for 循环

Vectorize MATLAB for loop

我有以下几行代码

y = zeros(n, 1);

for i=1:n 
    b = L * [u(i:-1:max(1,i-M+1));zeros((-i+M)*(i-M<0),1)];
    y(i) = b' * gamma;
end

u 是 nx1,gamma 是 Mx1,L 是 MxM

n 取非常大的值,那么有什么关于如何矢量化 for 循环的想法吗?

编辑:我正在重做解决方案,因为我发现 Matlab 不能很好地处理匿名函数。所以我将调用从匿名函数更改为普通函数。进行此更改:

测试 1

Comparison(40E3, 3E3)
Elapsed time is 21.731176 seconds.
Elapsed time is 251.327347 seconds.
|y2-y1| = 3.1519e-06

测试 2

Comparison(40E3, 1E3)
Elapsed time is 6.407259 seconds.
Elapsed time is 25.551116 seconds.
|y2-y1| = 2.8402e-07

测试 3

Comparison(20E3, 3E3)
Elapsed time is 10.484422 seconds.
Elapsed time is 125.033313 seconds.
|y2-y1| = 1.5462e-06

测试 4

Comparison(20E3, 1E3)
Elapsed time is 3.153404 seconds.
Elapsed time is 13.200649 seconds.
|y2-y1| = 1.5627e-07

函数为:

function Comparison(n, M)
    u = rand(n, 1);
    L = rand(M);
    gamma = rand(M, 1);

    tic
        y1 = vectorized(u, L, gamma);
    toc

    tic
        y2 = looped(u, L, gamma);
    toc

    disp(['|y2-y1| = ', num2str(norm(y2 - y1, 1))])
end

function y = vectorized(u, l, gamma)
global a Column
    M = length(gamma);
    Column = l' * gamma;

    x = bsxfun(@plus, -(1:M)', (1:length(u)) + 1);
    x(x < 1) = 1;
    a = u(x);
    a(1:M, 1:M) = a(1:M, 1:M) .* triu(ones(M));
    a = a';
    m = 1:size(a,1);
    y = arrayfun(@VectorY , m)';
end

function yi = VectorY(j)
global a Column
    yi = a(j,:) * Column;
end

function y = looped(U, l, gamma)
    n = length(U);
    M = length(gamma);

    u = U';
    L = l';
    y = zeros(n, 1);

    for i=1:n
        y(i) = [u(i:-1:max(1,i-M+1)), zeros(1,(-i+M)*(i-M<0))] * L * gamma;
    end
end

讨论和解决代码

初步接近

基于矩阵乘法的方法 -

u_pad = [zeros(M-1,1) ; u];         %//   Pad u with zeros
idx = bsxfun(@plus,[M:-1:1]',0:n-1);%//'# Calculate sliding indices for u
u_pad_indexed = u_pad(idx);         %//   Index into padded u 
y_vectzed = gamma.'*L*u_pad_indexed;%//'# Matrix-multiplications for final o/p

修改方法

现在,您需要处理大量数据。因此,为了针对这种情况进行优化,可以将数据分解成更小的可运行部分,并且可以迭代地完成操作。然后,每次迭代都会计算输出数组的一部分。

使用这个新策略,初始设置可以完成一次并在每次迭代中重复使用。修改后的方法看起来像这样 -

div_factor = [...]   %// Make sure it is a divisor of n
nrows = n/div_factor;
start_idx =  bsxfun(@plus,[M:-1:1]',0:nrows-1);  %//'
u_pad = [zeros(M-1,1) ; u];

y_vectorized = zeros(div_factor,n/div_factor);
for iter = 1:div_factor
    u_pad_indexed = u_pad((iter-1)*nrows + start_idx);
    y_vectorized(iter,:) = gamma.'*L*u_pad_indexed;  %//'
end
y_vectorized = reshape(y_vectorized.',[],1);

基准测试

%// Size parameters and input arrays
n = 4000000;
M = 1000;
u = rand(n,1);
gamma = rand(M,1);
L = rand(M,M);

%// Warm up tic/toc.
for k = 1:50000
    tic(); elapsed = toc();
end

disp('----------- With Original loopy code');
tic
y = zeros(n, 1);
for i=1:n
    b = L * [u(i:-1:max(1,i-M+1));zeros((-i+M)*(i-M<0),1)];
    y(i) = b' * gamma;  %//'
end
toc
clear b y

disp('----------- With Proposed solution code');
tic
..... Proposed Modified Code with div_factor = 200
toc

运行时间

----------- With Original loopy code
Elapsed time is 498.563049 seconds.
----------- With Proposed solution code
Elapsed time is 44.273299 seconds.