More efficient way to multiply each column of a 2-d matrix by each slice of a 3-d matrix

Question

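I have an 8x8x25000 array W and an 8x25000 array r. I want to multiply each 8x8 slice of W by the corresponding column (8x1) of r and save the result in Wres, which will end up being an 8x25000 matrix.

I am doing this with a for loop: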

Wres = np.empty((8, 25000))  # preallocate the 8x25000 result
for i in range(25000):
    Wres[:, i] = np.matmul(W[:, :, i], r[:, i])

But this is slow, and I am hoping there is a quicker way to do it.

Any ideas?

Answer 1

np.matmul can broadcast as long as the two arrays have the same length along their first axis. From the docs:

If either argument is N-D, N > 2, it is treated as a stack of matrices residing in the last two indexes and broadcast accordingly.
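As a rough illustration of what that rule means for shapes, here is a minimal sketch. The array names and sizes (`stack`, `cols`, `single`) are made up for this example and are not taken from the question.

```python
import numpy as np

# Hypothetical stack of 5 matrices, each 3x4, and 5 column vectors, each 4x1.
stack = np.ones((5, 3, 4))
cols = np.ones((5, 4, 1))

# matmul treats the last two axes as matrices and pairs them up along the leading axis.
out = np.matmul(stack, cols)
print(out.shape)  # (5, 3, 1)

# A single 2-D matrix is broadcast against every matrix in the stack.
single = np.ones((4, 2))
out_single = np.matmul(stack, single)
print(out_single.shape)  # (5, 3, 2)
```

The stacking axis has to come first, which is why the arrays in the question (slices on the last axis) need to be transposed before the call.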

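So you need to perform two operations before calling matmul:

Transpose a and b so that the first axis indexes the 100 slices
Add an extra dimension to b so that b.shape = (100, 8, 1)

With these example arrays:

import numpy as np
a = np.random.rand(8, 8, 100)
b = np.random.rand(8, 100)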

Then:

 at = a.transpose(2, 0, 1) # swap to shape 100, 8, 8
 bt = b.T[..., None] # swap to shape 100, 8, 1
 c = np.matmul(at, bt)

c now has shape (100, 8, 1); reshape it back to (8, 100):

 c = np.squeeze(c).swapaxes(0, 1)

or

 c = np.squeeze(c).T

Finally, for convenience, as a one-liner:

c = np.squeeze(np.matmul(a.transpose(2, 0, 1), b.T[..., None])).T
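One way to sanity-check the batched version is to compare it against the explicit loop from the question. The sketch below does that on small random arrays. It indexes with `[..., 0]` instead of calling `np.squeeze`; that is a substitution made for this example, because `np.squeeze` without an `axis` argument removes every length-1 axis and would also collapse the slice axis if there were only one slice.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((8, 8, 100))
b = rng.random((8, 100))

# Reference result from the explicit loop in the question.
expected = np.empty((8, 100))
for i in range(a.shape[2]):
    expected[:, i] = np.matmul(a[:, :, i], b[:, i])

# Batched matmul as in the answer, dropping only the trailing length-1 axis.
c = np.matmul(a.transpose(2, 0, 1), b.T[..., None])[..., 0].T

print(c.shape)                   # (8, 100)
print(np.allclose(c, expected))  # True
```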

Answer 2

An alternative to np.matmul is np.einsum, which does the job in a single, shorter, and arguably more palatable line of code, with no method chaining.

Example arrays:

import numpy as np

np.random.seed(123)
w = np.random.rand(8, 8, 25000)
r = np.random.rand(8, 25000)
wres = np.einsum('ijk,jk->ik', w, r)

# a quick check on result equivalency to your loop
print(np.allclose(np.matmul(w[:, :, 1], r[:, 1]), wres[:, 1]))
True
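The subscript string 'ijk,jk->ik' can be read as wres[i, k] = sum over j of w[i, j, k] * r[j, k]: the index j appears in both inputs but not in the output, so it is summed over, while k is carried through as the slice index. A minimal sketch spelling out that sum with plain broadcasting, on smaller arrays than the question's (the sizes are chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(123)
w = rng.random((8, 8, 50))  # axes: i, j, k
r = rng.random((8, 50))     # axes: j, k

via_einsum = np.einsum('ijk,jk->ik', w, r)

# Same contraction written out: multiply elementwise over (i, j, k), then sum over j.
via_sum = (w * r[None, :, :]).sum(axis=1)

print(via_einsum.shape)                  # (8, 50)
print(np.allclose(via_einsum, via_sum))  # True
```

The broadcasting form builds a full 8x8x50 temporary array, so it is shown here only to explain the notation, not as a faster alternative.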

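The timing is equivalent to @Imanol's solution, so take your pick of the two. Both are roughly 25 times faster than the loop (about 2.5 ms versus 64 ms in the timings below). Here einsum is competitive because of the size of the arrays. For arrays larger than these it may win out, and for smaller arrays it may lose out. See the discussion for more information.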

def solution1():
    return np.einsum('ijk,jk->ik',w,r)

def solution2():
    return np.squeeze(np.matmul(w.transpose(2, 0, 1), r.T[..., None])).T

def solution3():
    Wres = np.empty((8, 25000))
    for i in range(0,25000):
        Wres[:,i] = np.matmul(w[:,:,i],r[:,i])
    return Wres

%timeit solution1()
100 loops, best of 3: 2.51 ms per loop

%timeit solution2()
100 loops, best of 3: 2.52 ms per loop

%timeit solution3()
10 loops, best of 3: 64.2 ms per loop
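%timeit is an IPython magic, so it is not available in a plain Python script. A rough equivalent using the standard-library timeit module might look like the sketch below. The function names and the repeat counts are choices made for this example, and the numbers it prints will differ from machine to machine.

```python
import timeit
import numpy as np

rng = np.random.default_rng(123)
w = rng.random((8, 8, 25000))
r = rng.random((8, 25000))

def with_einsum():
    return np.einsum('ijk,jk->ik', w, r)

def with_matmul():
    # [..., 0] drops the trailing length-1 axis left by the batched product.
    return np.matmul(w.transpose(2, 0, 1), r.T[..., None])[..., 0].T

def with_loop():
    out = np.empty((8, w.shape[2]))
    for i in range(w.shape[2]):
        out[:, i] = np.matmul(w[:, :, i], r[:, i])
    return out

best_ms = {}
for fn in (with_einsum, with_matmul, with_loop):
    # Best of 3 repeats, 3 calls each, reported per call in milliseconds.
    times = timeit.repeat(fn, number=3, repeat=3)
    best_ms[fn.__name__] = min(times) / 3 * 1e3
    print(f"{fn.__name__}: {best_ms[fn.__name__]:.2f} ms per call")
```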

Credit: @Divakar
