Why does torch.autograd.grad() return None with torch.cat?

Question

I know that torch.autograd.grad() returns None when the gradient is stopped somewhere along the way, but what is wrong with the following snippet?

import torch

x = torch.rand(6, requires_grad=True)

y = x.pow(2).sum()
z = torch.cat([x])

grad1 = torch.autograd.grad(y, x, allow_unused=True)
grad2 = torch.autograd.grad(y, z, allow_unused=True)

print(f'grad1 = {grad1}, grad2 = {grad2}')

The output is grad1 = (tensor([0.3705, 0.7468, 0.6102, 1.8640, 0.3518, 0.5397]),), grad2 = (None,). I expected grad2 to be the same as grad1, since z is essentially x. Why is that not the case?

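Update: after reading a related post and with @Ivan's help, I concluded that the reason is that x is a leaf node of y, but z is not. x is a leaf node in the computation graph of both y and z, but there is no path from z to y, so torch.autograd.grad returns None.

Note: a return value of None is not necessarily guaranteed to mean a gradient of 0.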

Answer 1

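Tensor z is not used to compute the value of y, so it is not part of y's computation graph. Because nothing connects z to y, you will not get a gradient with respect to z.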
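A minimal sketch of this point, assuming a recent PyTorch release: z is a new non-leaf tensor produced by torch.cat, and because y was built from x alone, asking for the gradient with respect to z without allow_unused=True raises a RuntimeError instead of returning None. The flag name raised is only illustrative.

```python
import torch

x = torch.rand(6, requires_grad=True)
y = x.pow(2).sum()      # y is computed from x only
z = torch.cat([x])      # z is a separate, non-leaf tensor derived from x

print(x.is_leaf, z.is_leaf)   # x is a leaf, z is not

# Without allow_unused=True, autograd reports the unused input as an error
try:
    torch.autograd.grad(y, z)
    raised = False
except RuntimeError as err:
    raised = True
    print(err)
```

With allow_unused=True, the same call returns (None,) rather than raising, which is the behavior seen in the question.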

On the other hand, the following works, because in the second case y is computed from z:

>>> y = x.pow(2).sum()
>>> torch.autograd.grad(y, x, allow_unused=True)
(tensor([0.3134, 1.6802, 0.1989, 0.8495, 1.9203, 1.0905]),)

>>> z = torch.cat([x])
>>> y = z.pow(2).sum()
>>> torch.autograd.grad(y, z, allow_unused=True)
(tensor([0.3134, 1.6802, 0.1989, 0.8495, 1.9203, 1.0905]),)
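As a further sketch along the same lines (not part of the original answer), when y is computed from z, both x and z lie on the path to y, so a single call can return gradients for both. The names grad_x and grad_z are illustrative.

```python
import torch

x = torch.rand(6, requires_grad=True)
z = torch.cat([x])          # non-leaf tensor derived from x
y = z.pow(2).sum()          # y is computed from z, so z is in y's graph

# Both inputs are used in the graph of y, so neither gradient is None
grad_x, grad_z = torch.autograd.grad(y, [x, z])
print(grad_x)
print(grad_z)
```

Both gradients equal 2 * x here, since torch.cat on a single tensor passes the gradient through unchanged.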

pytorch

autograd