Why does torch.Tensor subtraction work when the tensor sizes differ?

Question

An example makes this easier to understand. The following fails:

import torch

A = torch.tensor([[1, 2, 3], [4, 5, 6]])    # shape: (2, 3)
B = torch.tensor([[1, 2], [3, 4], [5, 6]])  # shape: (3, 2)
print((A - B).shape)

# RuntimeError: The size of tensor a (3) must match the size of tensor b (2) at non-singleton dimension 1
# ==================================================================
A = torch.tensor([[1, 2], [3, 4], [5, 6]])  # shape: (3, 2)
B = torch.tensor([[1, 2], [3, 4]])          # shape: (2, 2)
print((A - B).shape)

# RuntimeError: The size of tensor a (3) must match the size of tensor b (2) at non-singleton dimension 0

But the following works fine:

a = torch.ones(8).unsqueeze(0).unsqueeze(-1).expand(4, 8, 7)  # shape: (4, 8, 7)
a_temp = a.unsqueeze(2)                            # shape: (4, 8, 1, 7)
b_temp = torch.transpose(a_temp, 1, 2)             # shape: (4, 1, 8, 7)
print(a_temp - b_temp)                             # result shape: (4, 8, 8, 7)

Why does the latter work while the former does not?
How/why does the result shape get expanded?

Answer 1

PyTorch's broadcasting semantics explain this well. The important part is:

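Two tensors are "broadcastable" if the following rules hold:

Each tensor has at least one dimension.
When iterating over the dimension sizes, starting at the trailing dimension, the dimension sizes must either be equal, one of them is 1, or one of them does not exist.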
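The second rule can be sketched in plain Python. This is only an illustration of the rule as quoted, not PyTorch's actual implementation; the helper name `broadcast_shape` is hypothetical, and the sketch does not enforce the first rule (at least one dimension per tensor).

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible.

    Hypothetical helper for illustration; not part of PyTorch.
    """
    result = []
    # Walk both shapes from the trailing dimension.
    # A dimension that does not exist in the shorter shape is treated as 1.
    for dim_a, dim_b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"sizes {dim_a} and {dim_b} do not match and neither is 1")
    return tuple(reversed(result))


print(broadcast_shape((4, 8, 1, 7), (4, 1, 8, 7)))  # (4, 8, 8, 7)

try:
    broadcast_shape((2, 3), (3, 2))
except ValueError as err:
    print("not broadcastable:", err)
```

Recent PyTorch versions also provide `torch.broadcast_shapes`, which performs the same kind of check on shapes without creating any tensors.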

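In your case, (3,2) and (2,3) cannot be broadcast to a common shape (3 != 2 and neither equals 1), and the same holds for (3,2) and (2,2) at dimension 0. By contrast, (4,8,1,7) and (4,1,8,7) are broadcast-compatible and produce the shape (4,8,8,7).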

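This is essentially what the error message states: every pair of dimensions must either be equal ("match") or contain a singleton (i.e. a size of 1).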

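What happens when shapes are broadcast is essentially a tensor expansion that makes the shapes match (both operands are expanded to [4,8,8,7]), after which the subtraction is performed element-wise as usual. Expansion duplicates your data to reach the required shape, and it does so in a clever way: no memory is actually copied, because the expanded dimension simply gets a stride of 0.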
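As a rough check of this description, the sketch below expands both operands explicitly with `Tensor.expand` and compares the result with the implicitly broadcast subtraction. The variable names `a_exp`, `b_exp`, `explicit` and `implicit` are introduced here for illustration only.

```python
import torch

a = torch.ones(8).unsqueeze(0).unsqueeze(-1).expand(4, 8, 7)
a_temp = a.unsqueeze(2)                  # shape: (4, 8, 1, 7)
b_temp = torch.transpose(a_temp, 1, 2)   # shape: (4, 1, 8, 7)

# Expand both operands to the common shape by hand, then subtract.
a_exp = a_temp.expand(4, 8, 8, 7)
b_exp = b_temp.expand(4, 8, 8, 7)
explicit = a_exp - b_exp

# Let broadcasting do the same thing implicitly.
implicit = a_temp - b_temp

print(explicit.shape)                    # torch.Size([4, 8, 8, 7])
print(torch.equal(explicit, implicit))   # True

# The expanded dimension is a view with stride 0, so no data is copied.
print(a_exp.stride()[2])                 # 0
```

Since `expand` returns a view, writing in place into an expanded tensor is usually a mistake; call `.clone()` or `.contiguous()` first if an independent copy is needed.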

