matlab: vectorizing a single loop that is ORing N binary matrices
The code below does some computation on the data matrix, where the data is laid out as:
data = [ ...
1 2 3 4 5 6; ...
1 2 3 4 5 6; ...
1 2 3 4 5 6;]
and the code I run is:
[~,col] = size(data) ;
flag1 = bsxfun(@lt, data(:,1), data(:,1).');
flag2 = bsxfun(@gt, data(:,1), data(:,1).');
for cindex = 2:col % can we get rid of this loop ?
    flag1 = flag1 | bsxfun(@lt, data(:,cindex), data(:,cindex).');
    flag2 = flag2 | bsxfun(@gt, data(:,cindex), data(:,cindex).');
end
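What this code does is compare every row with every other row, one column at a time, and build two binary matrices, flag1 and flag2. flag1(i,j) is true if row i is smaller than row j in at least one column, and flag2(i,j) is true if row i is larger than row j in at least one column.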
Is there any way to get rid of this for cindex = 2:col loop?
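To make the semantics of the two flags concrete, here is a minimal sketch that computes them pair by pair with any. The 3-by-2 matrix is made up for illustration (with the data shown in the question, where all rows are identical, both flags would be all false), and the nested loops are only there for readability, not as a suggested implementation.

```matlab
% Made-up 3-by-2 example, not the data from the question.
data = [1 4; 2 3; 1 4];
nrow = size(data, 1);

flag1 = false(nrow);
flag2 = false(nrow);
for i = 1:nrow
    for j = 1:nrow
        flag1(i,j) = any(data(i,:) < data(j,:));  % row i smaller in some column
        flag2(i,j) = any(data(i,:) > data(j,:));  % row i larger in some column
    end
end

disp(flag1)
% flag2 is the transpose of flag1, because a > b is the same as b < a.
disp(isequal(flag2, flag1.'))
```

One consequence of the second check is that flag2 never has to be computed separately: flag2 = flag1.' gives the same matrix.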
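You need some permuting (rearranging dimensions) to create singleton dimensions, so that bsxfun can expand along them later. That expansion essentially replaces the loop used in the originally posted code. So the implementation would look like this: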
flag1 = any(bsxfun(@lt,permute(data,[1 3 2]),permute(data,[3 1 2])),3);
flag2 = any(bsxfun(@gt,permute(data,[1 3 2]),permute(data,[3 1 2])),3);
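As a sanity check, the sketch below compares the loop with the permute-based version on a small matrix and spells out the array shapes involved. The input values and the names A, B, flag1_loop, flag1_vec and sameResult are assumptions introduced only for this illustration.

```matlab
% Small example input (values chosen arbitrarily for illustration).
data = [1 5 3; 2 2 9; 4 0 3; 1 5 3];
[nrow, col] = size(data);

% Reference: the loop from the question, ORing one column comparison at a time.
% Starting from all-false and looping from column 1 gives the same result.
flag1_loop = false(nrow);
flag2_loop = false(nrow);
for cindex = 1:col
    flag1_loop = flag1_loop | bsxfun(@lt, data(:,cindex), data(:,cindex).');
    flag2_loop = flag2_loop | bsxfun(@gt, data(:,cindex), data(:,cindex).');
end

% Vectorized: rows along dim 1 versus rows along dim 2, columns moved to dim 3.
A = permute(data, [1 3 2]);   % size nrow x 1 x col
B = permute(data, [3 1 2]);   % size 1 x nrow x col
flag1_vec = any(bsxfun(@lt, A, B), 3);   % bsxfun result is nrow x nrow x col
flag2_vec = any(bsxfun(@gt, A, B), 3);

sameResult = isequal(flag1_loop, flag1_vec) && isequal(flag2_loop, flag2_vec);
fprintf('results identical: %d\n', sameResult);
```

The any(..., 3) call plays the role of the repeated | in the loop: it ORs the col slices of the nrow x nrow x col array along the third dimension.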
I was curious how much I actually gained, and it turned out to be at least a twofold speedup! Here is a benchmark:
clear all
data = rand(500,500);
[~,col] = size(data);
maxrun = 20 ;
% warm up tic/toc
for k = 1:50000
    tic(); elapsed = toc();
end
% 1) original loop, ORing with |
toctime = 0 ;
for i = 1:maxrun
    flag1 = bsxfun(@lt, data(:,1), data(:,1).');
    flag2 = bsxfun(@gt, data(:,1), data(:,1).');
    tic
    for cindex = 2:col % can we get rid of this loop ?
        flag1 = flag1 | bsxfun(@lt, data(:,cindex), data(:,cindex).');
        flag2 = flag2 | bsxfun(@gt, data(:,cindex), data(:,cindex).');
    end
    toctime = toctime + toc ;
end
fprintf('time elapsed: %0.4f sec\n', toctime/maxrun);
% 2) same loop, ORing with bsxfun(@or, ...)
toctime = 0 ;
for i = 1:maxrun
    flag1 = bsxfun(@lt, data(:,1), data(:,1).');
    flag2 = bsxfun(@gt, data(:,1), data(:,1).');
    tic
    for cindex = 2:col % can we get rid of this loop ?
        flag1 = bsxfun(@or, flag1, bsxfun(@lt, data(:,cindex), data(:,cindex).'));
        flag2 = bsxfun(@or, flag2, bsxfun(@gt, data(:,cindex), data(:,cindex).'));
    end
    toctime = toctime + toc ;
end
fprintf('time elapsed: %0.4f sec\n', toctime/maxrun);
% 3) vectorized version using permute and any
toctime = 0 ;
for i = 1:maxrun
    tic
    flag1 = any(bsxfun(@lt,permute(data,[1 3 2]),permute(data,[3 1 2])),3);
    flag2 = any(bsxfun(@gt,permute(data,[1 3 2]),permute(data,[3 1 2])),3);
    toctime = toctime + toc ;
end
fprintf('time elapsed: %0.4f sec\n', toctime/maxrun);
fprintf('done.\n');
One interesting thing to note: doing | on two matrices and doing bsxfun(@or... take almost the same time:
>> vectest
time elapsed: 0.8609 sec
time elapsed: 0.7914 sec
time elapsed: 0.3285 sec
done.
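For comparison, the same measurements can be sketched with timeit, which handles warm-up and repetition itself. This is only an illustrative variant, not the benchmark that produced the numbers above: the size n = 200 and the names A, B, P, Q, tBsx, tPipe and tBsxOr are assumptions made here. A smaller n is used because the vectorized version builds an n x n x col logical intermediate, which for the 500 x 500 input above is 125,000,000 elements, or about 125 MB.

```matlab
n = 200;                      % smaller than the 500 used above, to limit memory
data = rand(n, n);
A = permute(data, [1 3 2]);   % n x 1 x n
B = permute(data, [3 1 2]);   % 1 x n x n

% Vectorized comparison via bsxfun, reduced along the third dimension.
tBsx = timeit(@() any(bsxfun(@lt, A, B), 3));

% Element-wise OR of two logical matrices: operator versus bsxfun(@or, ...).
P = rand(n) > 0.5;
Q = rand(n) > 0.5;
tPipe  = timeit(@() P | Q);
tBsxOr = timeit(@() bsxfun(@or, P, Q));

fprintf('vectorized lt: %0.4f sec\n', tBsx);
fprintf('P | Q: %0.6f sec, bsxfun(@or,P,Q): %0.6f sec\n', tPipe, tBsxOr);
```

On MATLAB R2016b and later, implicit expansion lets A < B broadcast the same way as bsxfun(@lt, A, B), so the bsxfun call can be dropped there. Timings depend on the machine and MATLAB release, so no expected output is shown.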