numpy broadcasting - explanation of trailing axes

Question

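Please elaborate on the answer to the 2012 question "Numpy array broadcasting rules" and explain what trailing axes are. I am not sure which "linked documentation page" the answer refers to. Perhaps it has changed in the past 8 years.

Since "axes" in "trailing axes" is plural, do the sizes of at least the last two axes have to match (unless one of them is 1)? If so, why at least two?

The answer given was: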

Well, the meaning of trailing axes is explained on the linked documentation page. If you have two arrays with different dimensions number, say one 1x2x3 and other 2x3, then you compare only the trailing common dimensions, in this case 2x3. But if both your arrays are two-dimensional, then their corresponding sizes have to be either equal or one of them has to be 1.

In your case you have a 2x2 and 4x2 and 4 != 2 and neither 4 or 2 equals 1, so this doesn't work.

The error and the question asked were:

A = np.array([[1,2],[3,4]])
B = np.array([[2,3],[4,6],[6,9],[8,12]])
print("A.shape {}".format(A.shape))
print("B.shape {}".format(B.shape))
A*B
---
A.shape (2, 2)              # <---- The last axis size is 2 in both shapes.
B.shape (4, 2)              # <---- Apparently this "2" is not the size of trailing axis/axes

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-91-7a3f7e97944d> in <module>
      3 print("A.shape {}".format(A.shape))
      4 print("B.shape {}".format(B.shape))
----> 5 A*B

ValueError: operands could not be broadcast together with shapes (2,2) (4,2) 


Since both A and B have two columns, I would have thought this would work. 
So, I'm probably misunderstanding something here about the term "trailing axis", 
and how it applies to N-dimensional arrays.

References

Array Broadcasting in Numpy

The Broadcasting Rule
In order to broadcast, the size of the trailing axes for both arrays in an operation must either be the same size or one of them must be one.

Broadcasting

Update

My understanding, based on @Akshay Sehgal's reply. Consider two arrays with A.shape = (4,5,1) and B.shape = (1,2).

A = np.arange(20).reshape((4, 5, 1))
B = np.arange(2).reshape((1,2))
print("A.shape {}".format(A.shape))
print("B.shape {}".format(B.shape))
---
A.shape (4, 5, 1)
B.shape (1, 2)

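First look at axis=-1: the size 01 in A is broadcast from 01 to 02 to match B's size, because it is singular (1). Then, for axis=-2, the size 01 in B is broadcast from 01 (singular) to 05 to match A. B has no axis=-3, so A's size 04 is kept. The result has shape (4, 5, 2).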

print("A * B shape is {}".format((A*B).shape))
---
A * B shape is (4, 5, 2)

Based on @hpaulj's answer, a way to simulate the broadcasting by hand:

print("A.shape {}".format(A.shape))
print("B.shape {}".format(B.shape))
---
A.shape (4, 5, 1)
B.shape (1, 2)

# Check ranks.
print("rank(A) {} rank(B) {}".format(A.ndim, B.ndim))
---
rank(A) 3 rank(B) 2

# Expand B because rank(B) < rank(A).
B = B[
    None,
    ::
]
B.shape
---
(1, 1, 2)

A:(4,5,1)
   ↑ ↑ ↓
B:(1,1,2)
----------
C:(4,5,2)

Answer 1

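Trailing axes are axis=-1, axis=-2, axis=-3, and so on. The broadcasting rules compare trailing axes, as opposed to leading axes (axis=0 onwards).

This matters specifically when broadcasting is applied to tensors with different numbers of dimensions (say, a 2D and a 3D tensor). "Trailing axes" basically indicates the direction in which the broadcasting rules consider the axes. Imagine lining up the axes by shape. If you lead with the first axes, you get something like this:

Consider two arrays with A.shape = (4,5,1) and B.shape = (1,2)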

#Leading axes

A  04  05  01
B  01  02
--------------
No broadcasting
--------------

To consider trailing axes instead, you line them up like this:

#Trailing axes

A  04  05  01
B      01  02
--------------
C  04  05  02
--------------

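That is all the term "trailing axes" means in this context: start from the back instead of from the leading axes.

In other words, when broadcasting a (1,2) shaped array against a higher-dimensional array, we look at the trailing axes in reverse order: size 2 for axis=-1, then size 1 for axis=-2.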
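The right-aligned comparison above can be sketched in plain Python. This is only an illustration, not NumPy's actual implementation; `broadcast_shape_trailing` is a hypothetical helper name introduced here. It walks both shapes from the last axis backwards and treats a missing axis as size 1.

```python
from itertools import zip_longest

import numpy as np


def broadcast_shape_trailing(shape_a, shape_b):
    """Hypothetical helper: combine two shapes, comparing from the last axis backwards."""
    result = []
    # Reversing puts axis=-1 first; an axis that one shape lacks counts as size 1.
    for size_a, size_b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if size_a == size_b or size_b == 1:
            result.append(size_a)
        elif size_a == 1:
            result.append(size_b)
        else:
            raise ValueError(
                "shapes {} and {} cannot be broadcast".format(shape_a, shape_b)
            )
    # Undo the reversal to restore the usual leading-to-trailing order.
    return tuple(reversed(result))


shape_ok = broadcast_shape_trailing((4, 5, 1), (1, 2))
print(shape_ok)

# Cross-check against what NumPy produces for real arrays of those shapes.
numpy_shape = (np.empty((4, 5, 1)) * np.empty((1, 2))).shape
print(numpy_shape)

# The shapes from the question, (2, 2) and (4, 2), fail at axis=-2 (2 vs 4).
try:
    broadcast_shape_trailing((2, 2), (4, 2))
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
print(mismatch_detected)
```

For the question's arrays, axis=-1 matches (2 and 2), but axis=-2 compares 2 with 4 and neither is 1, which is why the multiplication raises.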

Answer 2

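The way I explain broadcasting puts less focus on trailing axes and more on two rules:

match the number of dimensions by adding leading size 1 dimensions
scale all size 1 dimensions to match

In your example, pared down: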

In [233]: A = np.arange(20).reshape((4, 5))
     ...: B = np.arange(2)
In [234]: A
Out[234]: 
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])
In [235]: B
Out[235]: array([0, 1])
In [236]: A*B
Traceback (most recent call last):
  File "<ipython-input-236-47896efed660>", line 1, in <module>
    A*B
ValueError: operands could not be broadcast together with shapes (4,5) (2,)

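By the first rule, (2,) is expanded to (1,2), and potentially to (4,2), but that is a dead end: the last axes, 5 and 2, still do not match and neither is 1.

But if we add a dimension to A, making it (4,5,1):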

In [237]: A[:,:,None]*B
Out[237]: 
array([[[ 0,  0],
        [ 0,  1],
        [ 0,  2],
        ...
        [ 0, 19]]])
In [238]: _.shape
Out[238]: (4, 5, 2)

Now the (2,) expands to (1,1,2), which works with (4,5,1).

Starting with (1,2) for B also works:

In [240]: (A[:,:,None]*B[None,:]).shape
Out[240]: (4, 5, 2)

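It can add as many leading dimensions to B as needed, but it cannot automatically add trailing dimensions to A. We have to do that ourselves. reshape works fine for adding dimensions, but I think the None/newaxis idiom highlights the addition better.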
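As a small illustration of the options just mentioned (the variable names are made up for this sketch), the three spellings below all append a trailing size 1 axis to a (4,5) array:

```python
import numpy as np

A = np.arange(20).reshape((4, 5))

# Three ways to append a trailing axis of size 1.
with_none = A[:, :, None]
with_newaxis = A[:, :, np.newaxis]
with_reshape = A.reshape(4, 5, 1)

print(with_none.shape, with_newaxis.shape, with_reshape.shape)

# np.newaxis is just an alias for None.
print(np.newaxis is None)

# All three hold the same values in the same layout.
print(np.array_equal(with_none, with_reshape))
```

The indexing forms show at a glance where the new axis goes, while reshape requires spelling out the full target shape.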

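This behavior could be explained in terms of trailing axes (it doesn't have to be plural), but I think the two-step explanation is clearer.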
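The two steps can be spelled out by hand and compared with what NumPy does automatically. This is a rough sketch using np.broadcast_to, under the assumption that the shapes are already known to be compatible; it is not meant to mirror NumPy's internals.

```python
import numpy as np

A = np.arange(20).reshape((4, 5, 1))
B = np.arange(2)  # shape (2,)

# Step 1: match the number of dimensions by adding leading size 1 axes to B.
B_padded = B.reshape((1,) * (A.ndim - B.ndim) + B.shape)  # shape (1, 1, 2)

# Step 2: stretch every size 1 axis to the size of its counterpart.
# Taking max() per axis is only valid because these shapes are compatible
# (each pair is either equal or contains a 1).
target_shape = tuple(max(a, b) for a, b in zip(A.shape, B_padded.shape))
A_stretched = np.broadcast_to(A, target_shape)
B_stretched = np.broadcast_to(B_padded, target_shape)

manual = A_stretched * B_stretched
automatic = A * B

print(B_padded.shape)
print(manual.shape)
print(np.array_equal(manual, automatic))
```

np.broadcast_to returns read-only views rather than copies, so the "stretching" here does not duplicate any data.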

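I think there are two reasons for the distinction between leading and trailing axes. Leading axes are outermost (at least for C order), and it avoids ambiguity.

Consider using (3,) and (2,) together. We could form either a (3,2) or a (2,3) array from them, but which?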

In [241]: np.array([1,2,3])*np.array([4,5])
Traceback (most recent call last):
  File "<ipython-input-241-eaf3e99b50a9>", line 1, in <module>
    np.array([1,2,3])*np.array([4,5])
ValueError: operands could not be broadcast together with shapes (3,) (2,) 

In [242]: np.array([1,2,3])[:,None]*np.array([4,5])
Out[242]: 
array([[ 4,  5],
       [ 8, 10],
       [12, 15]])

In [243]: np.array([1,2,3])*np.array([4,5])[:,None]
Out[243]: 
array([[ 4,  8, 12],
       [ 5, 10, 15]])

The explicit trailing None clearly identifies which one we want. We could add a [None,:] to the other array, but it isn't necessary.
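One way to see that the two choices really are different results, and not just different spellings, is to compare them with np.outer. This is only an illustrative check; the variable names are invented for the sketch.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5])

rows_from_a = a[:, None] * b  # (3, 1) with (2,) gives (3, 2)
rows_from_b = a * b[:, None]  # (3,) with (2, 1) gives (2, 3)

print(rows_from_a.shape, rows_from_b.shape)

# Each choice corresponds to one argument order of the outer product.
print(np.array_equal(rows_from_a, np.outer(a, b)))
print(np.array_equal(rows_from_b, np.outer(b, a)))

# The two results are transposes of each other.
print(np.array_equal(rows_from_a, rows_from_b.T))
```

Because both layouts are equally plausible, NumPy refuses to guess for (3,) with (2,) and leaves the choice to the explicit None.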
