与 bsxfun 相比，隐式扩展快多少？

Question

与 commented by Steve Eddins, implicit expansion (introduced in Matlab R2016b) is faster than bsxfun 对于小数组大小一样，对于大数组具有相似的速度：

In R2016b, implicit expansion works as fast or faster than bsxfun in most cases. The best performance gains for implicit expansion are with small matrix and array sizes. For large matrix sizes, implicit expansion tends to be roughly the same speed as bsxfun.

此外，扩展发生的维度可能会产生影响：

When there is an expansion in the first dimension, the operators might not be quite as fast as bsxfun.

（感谢@Poelie and @rayryeng for letting me 这个！）

自然会出现两个问题：

与bsxfun相比，隐式扩展快多少？
对于哪些阵列大小或形状，差异显着？

Answer 1

为了衡量速度上的差异，已经进行了一些测试。测试考虑 两种不同的操作:

加法
功率

和四种不同形状的待操作数组：

N×N 数组与 N×1 数组
N×N×N×N 数组与 N×1×N 数组
N×N 数组与 1×N 数组
N×N×N×N 数组与 1×N×N 数组

对于运算和数组形状的八种组合中的每一种，使用隐式扩展和 bsxfun 完成相同的运算。 使用了N的几个值，涵盖了从小数组到大数组的范围。 timeit 用于可靠计时。

基准代码在这个答案的末尾给出。在 Matlab R2016b 上运行，Windows 10，12 GB RAM。

结果

下图显示了结果。水平轴是输出数组的元素数，这是比 N 更好的大小度量。

还使用 逻辑运算（而不是算术运算）进行了测试。为简洁起见，此处未显示结果，但显示了类似的趋势。

结论

根据图表：

结果证实，对于小型数组，隐式扩展速度更快，对于大型数组，其速度与 bsxfun 相似。
沿第一个维度或沿其他维度扩展似乎没有太大影响，至少在所考虑的情况下如此。
对于小型阵列，差异可能是十倍或更多。但是请注意，timeit 对于小尺寸并不准确，因为代码速度太快（实际上，它会针对如此小的尺寸发出警告）。
当输出的元素个数达到约1e5时，两个速度相等。此值可能取决于系统。

由于速度提升仅在数组较小时才显着，在这种情况下，任何一种方法都非常快，使用隐式扩展或 bsxfun 似乎主要是品味问题，可读性, 或向后兼容性。

基准代码

clear

% NxN, Nx1, addition / power
N1 = 2.^(4:1:12);
t1_bsxfun_add = NaN(size(N1));
t1_implicit_add = NaN(size(N1));
t1_bsxfun_pow = NaN(size(N1));
t1_implicit_pow = NaN(size(N1));
for k = 1:numel(N1)
    N = N1(k);
    x = randn(N,N);
    y = randn(N,1);
    % y = randn(1,N); % use this line or the preceding one
    t1_bsxfun_add(k) = timeit(@() bsxfun(@plus, x, y));
    t1_implicit_add(k) = timeit(@() x+y);
    t1_bsxfun_pow(k) = timeit(@() bsxfun(@power, x, y));
    t1_implicit_pow(k) = timeit(@() x.^y);
end

% NxNxNxN, Nx1xN, addition / power
N2 = round(sqrt(N1));
t2_bsxfun_add = NaN(size(N2));
t2_implicit_add = NaN(size(N2));
t2_bsxfun_pow = NaN(size(N2));
t2_implicit_pow = NaN(size(N2));
for k = 1:numel(N1)
    N = N2(k);
    x = randn(N,N,N,N);
    y = randn(N,1,N);
    % y = randn(1,N,N); % use this line or the preceding one
    t2_bsxfun_add(k) = timeit(@() bsxfun(@plus, x, y));
    t2_implicit_add(k) = timeit(@() x+y);
    t2_bsxfun_pow(k) = timeit(@() bsxfun(@power, x, y));
    t2_implicit_pow(k) = timeit(@() x.^y);
end

% Plots
figure
colors = get(gca,'ColorOrder');

subplot(121)
title('N\times{}N,   N\times{}1')
% title('N\times{}N,   1\times{}N') % this or the preceding
set(gca,'XScale', 'log', 'YScale', 'log')
hold on
grid on
loglog(N1.^2, t1_bsxfun_add, 's-', 'color', colors(1,:))
loglog(N1.^2, t1_implicit_add, 's-', 'color', colors(2,:))
loglog(N1.^2, t1_bsxfun_pow, '^-', 'color', colors(1,:))
loglog(N1.^2, t1_implicit_pow, '^-', 'color', colors(2,:))
legend('Addition, bsxfun', 'Addition, implicit', 'Power, bsxfun', 'Power, implicit')

subplot(122)
title('N\times{}N\times{}N{}\times{}N,   N\times{}1\times{}N')
% title('N\times{}N\times{}N{}\times{}N,   1\times{}N\times{}N') % this or the preceding
set(gca,'XScale', 'log', 'YScale', 'log')
hold on
grid on
loglog(N2.^4, t2_bsxfun_add, 's-', 'color', colors(1,:))
loglog(N2.^4, t2_implicit_add, 's-', 'color', colors(2,:))
loglog(N2.^4, t2_bsxfun_pow, '^-', 'color', colors(1,:))
loglog(N2.^4, t2_implicit_pow, '^-', 'color', colors(2,:))
legend('Addition, bsxfun', 'Addition, implicit', 'Power, bsxfun', 'Power, implicit')

与 bsxfun 相比，隐式扩展快多少？

How much faster is implicit expansion compared with bsxfun?

arrays

performance

matlab

bsxfun

结果

结论

基准代码