与 bsxfun 相比,隐式扩展快多少?
How much faster is implicit expansion compared with bsxfun?
与 commented by Steve Eddins, implicit expansion (introduced in Matlab R2016b) is faster than bsxfun
对于小数组大小一样,对于大数组具有相似的速度:
In R2016b, implicit expansion works as fast or faster than bsxfun in most cases. The best performance gains for implicit expansion are with small matrix and array sizes. For large matrix sizes, implicit expansion tends to be roughly the same speed as bsxfun
.
此外,扩展发生的维度可能会产生影响:
When there is an expansion in the first dimension, the operators might not be quite as fast as bsxfun
.
(感谢@Poelie and @rayryeng for letting me 这个!)
自然会出现两个问题:
- 与
bsxfun
相比,隐式扩展快多少?
- 对于哪些阵列大小或形状,差异显着?
为了衡量速度上的差异,已经进行了一些测试。测试考虑 两种不同的操作:
- 加法
- 功率
和四种不同形状的待操作数组:
N×N
数组与 N×1
数组
N×N×N×N
数组与 N×1×N
数组
N×N
数组与 1×N
数组
N×N×N×N
数组与 1×N×N
数组
对于运算和数组形状的八种组合中的每一种,使用隐式扩展和 bsxfun
完成相同的运算。 使用了N
的几个值,涵盖了从小数组到大数组的范围。 timeit
用于可靠计时。
基准代码在这个答案的末尾给出。在 Matlab R2016b 上 运行,Windows 10,12 GB RAM。
结果
下图显示了结果。水平轴是输出数组的元素数,这是比 N
更好的大小度量。
还使用 逻辑运算(而不是算术运算)进行了测试。为简洁起见,此处未显示结果,但显示了类似的趋势。
结论
根据图表:
- 结果证实,对于小型数组,隐式扩展速度更快,对于大型数组,其速度与
bsxfun
相似。
- 沿第一个维度或沿其他维度扩展似乎没有太大影响,至少在所考虑的情况下如此。
- 对于小型阵列,差异可能是十倍或更多。但是请注意,
timeit
对于小尺寸并不准确,因为代码速度太快(实际上,它会针对如此小的尺寸发出警告)。
- 当输出的元素个数达到约
1e5
时,两个速度相等。此值可能取决于系统。
由于速度提升仅在数组较小时才显着,在这种情况下,任何一种方法都非常快,使用隐式扩展或 bsxfun
似乎主要是品味问题,可读性, 或向后兼容性。
基准代码
clear
% NxN, Nx1, addition / power
N1 = 2.^(4:1:12);
t1_bsxfun_add = NaN(size(N1));
t1_implicit_add = NaN(size(N1));
t1_bsxfun_pow = NaN(size(N1));
t1_implicit_pow = NaN(size(N1));
for k = 1:numel(N1)
N = N1(k);
x = randn(N,N);
y = randn(N,1);
% y = randn(1,N); % use this line or the preceding one
t1_bsxfun_add(k) = timeit(@() bsxfun(@plus, x, y));
t1_implicit_add(k) = timeit(@() x+y);
t1_bsxfun_pow(k) = timeit(@() bsxfun(@power, x, y));
t1_implicit_pow(k) = timeit(@() x.^y);
end
% NxNxNxN, Nx1xN, addition / power
N2 = round(sqrt(N1));
t2_bsxfun_add = NaN(size(N2));
t2_implicit_add = NaN(size(N2));
t2_bsxfun_pow = NaN(size(N2));
t2_implicit_pow = NaN(size(N2));
for k = 1:numel(N1)
N = N2(k);
x = randn(N,N,N,N);
y = randn(N,1,N);
% y = randn(1,N,N); % use this line or the preceding one
t2_bsxfun_add(k) = timeit(@() bsxfun(@plus, x, y));
t2_implicit_add(k) = timeit(@() x+y);
t2_bsxfun_pow(k) = timeit(@() bsxfun(@power, x, y));
t2_implicit_pow(k) = timeit(@() x.^y);
end
% Plots
figure
colors = get(gca,'ColorOrder');
subplot(121)
title('N\times{}N, N\times{}1')
% title('N\times{}N, 1\times{}N') % this or the preceding
set(gca,'XScale', 'log', 'YScale', 'log')
hold on
grid on
loglog(N1.^2, t1_bsxfun_add, 's-', 'color', colors(1,:))
loglog(N1.^2, t1_implicit_add, 's-', 'color', colors(2,:))
loglog(N1.^2, t1_bsxfun_pow, '^-', 'color', colors(1,:))
loglog(N1.^2, t1_implicit_pow, '^-', 'color', colors(2,:))
legend('Addition, bsxfun', 'Addition, implicit', 'Power, bsxfun', 'Power, implicit')
subplot(122)
title('N\times{}N\times{}N{}\times{}N, N\times{}1\times{}N')
% title('N\times{}N\times{}N{}\times{}N, 1\times{}N\times{}N') % this or the preceding
set(gca,'XScale', 'log', 'YScale', 'log')
hold on
grid on
loglog(N2.^4, t2_bsxfun_add, 's-', 'color', colors(1,:))
loglog(N2.^4, t2_implicit_add, 's-', 'color', colors(2,:))
loglog(N2.^4, t2_bsxfun_pow, '^-', 'color', colors(1,:))
loglog(N2.^4, t2_implicit_pow, '^-', 'color', colors(2,:))
legend('Addition, bsxfun', 'Addition, implicit', 'Power, bsxfun', 'Power, implicit')
与 commented by Steve Eddins, implicit expansion (introduced in Matlab R2016b) is faster than bsxfun
对于小数组大小一样,对于大数组具有相似的速度:
In R2016b, implicit expansion works as fast or faster than bsxfun in most cases. The best performance gains for implicit expansion are with small matrix and array sizes. For large matrix sizes, implicit expansion tends to be roughly the same speed as
bsxfun
.
此外,扩展发生的维度可能会产生影响:
When there is an expansion in the first dimension, the operators might not be quite as fast as
bsxfun
.
(感谢@Poelie and @rayryeng for letting me
自然会出现两个问题:
- 与
bsxfun
相比,隐式扩展快多少? - 对于哪些阵列大小或形状,差异显着?
为了衡量速度上的差异,已经进行了一些测试。测试考虑 两种不同的操作:
- 加法
- 功率
和四种不同形状的待操作数组:
N×N
数组与N×1
数组N×N×N×N
数组与N×1×N
数组N×N
数组与1×N
数组N×N×N×N
数组与1×N×N
数组
对于运算和数组形状的八种组合中的每一种,使用隐式扩展和 bsxfun
完成相同的运算。 使用了N
的几个值,涵盖了从小数组到大数组的范围。 timeit
用于可靠计时。
基准代码在这个答案的末尾给出。在 Matlab R2016b 上 运行,Windows 10,12 GB RAM。
结果
下图显示了结果。水平轴是输出数组的元素数,这是比 N
更好的大小度量。
还使用 逻辑运算(而不是算术运算)进行了测试。为简洁起见,此处未显示结果,但显示了类似的趋势。
结论
根据图表:
- 结果证实,对于小型数组,隐式扩展速度更快,对于大型数组,其速度与
bsxfun
相似。 - 沿第一个维度或沿其他维度扩展似乎没有太大影响,至少在所考虑的情况下如此。
- 对于小型阵列,差异可能是十倍或更多。但是请注意,
timeit
对于小尺寸并不准确,因为代码速度太快(实际上,它会针对如此小的尺寸发出警告)。 - 当输出的元素个数达到约
1e5
时,两个速度相等。此值可能取决于系统。
由于速度提升仅在数组较小时才显着,在这种情况下,任何一种方法都非常快,使用隐式扩展或 bsxfun
似乎主要是品味问题,可读性, 或向后兼容性。
基准代码
clear
% NxN, Nx1, addition / power
N1 = 2.^(4:1:12);
t1_bsxfun_add = NaN(size(N1));
t1_implicit_add = NaN(size(N1));
t1_bsxfun_pow = NaN(size(N1));
t1_implicit_pow = NaN(size(N1));
for k = 1:numel(N1)
N = N1(k);
x = randn(N,N);
y = randn(N,1);
% y = randn(1,N); % use this line or the preceding one
t1_bsxfun_add(k) = timeit(@() bsxfun(@plus, x, y));
t1_implicit_add(k) = timeit(@() x+y);
t1_bsxfun_pow(k) = timeit(@() bsxfun(@power, x, y));
t1_implicit_pow(k) = timeit(@() x.^y);
end
% NxNxNxN, Nx1xN, addition / power
N2 = round(sqrt(N1));
t2_bsxfun_add = NaN(size(N2));
t2_implicit_add = NaN(size(N2));
t2_bsxfun_pow = NaN(size(N2));
t2_implicit_pow = NaN(size(N2));
for k = 1:numel(N1)
N = N2(k);
x = randn(N,N,N,N);
y = randn(N,1,N);
% y = randn(1,N,N); % use this line or the preceding one
t2_bsxfun_add(k) = timeit(@() bsxfun(@plus, x, y));
t2_implicit_add(k) = timeit(@() x+y);
t2_bsxfun_pow(k) = timeit(@() bsxfun(@power, x, y));
t2_implicit_pow(k) = timeit(@() x.^y);
end
% Plots
figure
colors = get(gca,'ColorOrder');
subplot(121)
title('N\times{}N, N\times{}1')
% title('N\times{}N, 1\times{}N') % this or the preceding
set(gca,'XScale', 'log', 'YScale', 'log')
hold on
grid on
loglog(N1.^2, t1_bsxfun_add, 's-', 'color', colors(1,:))
loglog(N1.^2, t1_implicit_add, 's-', 'color', colors(2,:))
loglog(N1.^2, t1_bsxfun_pow, '^-', 'color', colors(1,:))
loglog(N1.^2, t1_implicit_pow, '^-', 'color', colors(2,:))
legend('Addition, bsxfun', 'Addition, implicit', 'Power, bsxfun', 'Power, implicit')
subplot(122)
title('N\times{}N\times{}N{}\times{}N, N\times{}1\times{}N')
% title('N\times{}N\times{}N{}\times{}N, 1\times{}N\times{}N') % this or the preceding
set(gca,'XScale', 'log', 'YScale', 'log')
hold on
grid on
loglog(N2.^4, t2_bsxfun_add, 's-', 'color', colors(1,:))
loglog(N2.^4, t2_implicit_add, 's-', 'color', colors(2,:))
loglog(N2.^4, t2_bsxfun_pow, '^-', 'color', colors(1,:))
loglog(N2.^4, t2_implicit_pow, '^-', 'color', colors(2,:))
legend('Addition, bsxfun', 'Addition, implicit', 'Power, bsxfun', 'Power, implicit')