autograd differentiation example in PyTorch - should be 9/8?
In the example from the Torch tutorial for Python, they use the following graph:
x = [[1, 1], [1, 1]]
y = x + 2
z = 3y^2
o = mean( z ) # 1/4 * z.sum()
So the forward pass gives us this:
x_i = 1, y_i = 3, z_i = 27, o = 27
In code:
import torch
# define graph
x = torch.ones(2, 2, requires_grad=True)
y = x + 2
z = y * y * 3
out = z.mean()
# if we don't do this, torch will only retain gradients for leaf nodes, ie: x
y.retain_grad()
z.retain_grad()
# the forward pass has already run; print the results
print(z, out)
However, I'm confused by the gradients that get computed:
# now let's run our backward prop & get gradients
out.backward()
print(f'do/dx = {x.grad[0,0]}')
Output:
do/dx = 4.5
By the chain rule, do/dx = do/dz * dz/dy * dy/dx, where:
dy/dx = 1
dz/dy = 9/2 given x_i=1
do/dz = 1/4 given x_i=1
Which means:
do/dx = 1/4 * 9/2 * 1 = 9/8
However, this doesn't match the gradient Torch returns (9/2 = 4.5). Maybe I've made a math error (the do/dz = 1/4 term?), or I don't understand autograd in Torch.
Any pointers?
The mistake is in the dz/dy term: since z = 3y^2, dz/dy = 6y, and with y_i = 3 that gives 18, not 9/2. Per element:
do/dz = 1/4
dz/dy = 6y = 6 * 3 = 18
dy/dx = 1
Therefore, do/dx = 1/4 * 18 * 1 = 9/2
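A minimal sketch to check this numerically, reusing the graph and tensor names from your code and simply printing each retained gradient:

import torch

# same graph as in the question
x = torch.ones(2, 2, requires_grad=True)
y = x + 2
z = y * y * 3
out = z.mean()

# retain gradients on the non-leaf tensors so we can inspect do/dy and do/dz
y.retain_grad()
z.retain_grad()

out.backward()

print(z.grad[0, 0])  # do/dz = 1/4 = 0.25 (mean over 4 elements)
print(y.grad[0, 0])  # do/dy = do/dz * dz/dy = 1/4 * 6y = 1/4 * 18 = 4.5
print(x.grad[0, 0])  # do/dx = do/dy * dy/dx = 4.5 * 1 = 4.5

The printed values (0.25, 4.5, 4.5) match the per-element chain-rule terms above, and x.grad agrees with the 4.5 Torch reports.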