Why does autograd not produce gradients for intermediate variables?
I'm trying to understand how gradients are represented and how autograd works:
import torch
from torch.autograd import Variable
x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x
z = y * y
z.backward()
print(x.grad)
#Variable containing:
#32
#[torch.FloatTensor of size 1]
print(y.grad)
#None
Why doesn't it produce a gradient for y? If y.grad = dz/dy, shouldn't it at least produce a Variable like y.grad = 2*y?
By default, gradients are only retained for leaf variables. Non-leaf variables' gradients are not retained to be inspected later. This was done by design, to save memory.
- Soumith Chintala
See: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94
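(Not from the thread, just a quick illustration of the leaf vs. non-leaf distinction, sketched with the current tensor API where Variable has been merged into Tensor:)
import torch
x = torch.tensor([2.0], requires_grad=True)  # created directly by the user -> leaf
y = x * x                                    # produced by an operation -> non-leaf
z = y * y
print(x.is_leaf, y.is_leaf, z.is_leaf)       # True False False
z.backward()
print(x.grad)  # tensor([32.])  retained, because x is a leaf
print(y.grad)  # None           not retained by default (newer versions also warn here)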
Option 1:
Call y.retain_grad()
x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x
z = y * y
y.retain_grad()
z.backward()
print(y.grad)
#Variable containing:
# 8
#[torch.FloatTensor of size 1]
Source: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/16
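(Side note, my own assumption rather than something from the thread: on PyTorch 0.4 and later, where Variable is deprecated, the same retain_grad() trick should look roughly like this:)
import torch
x = torch.tensor([2.0], requires_grad=True)
y = x * x
y.retain_grad()  # ask autograd to keep this non-leaf tensor's gradient
z = y * y
z.backward()
print(y.grad)    # tensor([8.])  i.e. dz/dy = 2*y with y = 4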
Option 2:
Register a hook, which is basically a function that gets called when that gradient is computed. Then you can save it, assign it, print it, whatever you like (see the sketch below for the "save it" case)...
from __future__ import print_function
import torch
from torch.autograd import Variable
x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x
z = y * y
y.register_hook(print) ## this can be anything you need it to be
z.backward()
Output:
Variable containing:
 8
[torch.FloatTensor of size 1]
Source: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/2
See also: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/7
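(The hook doesn't have to print. As a rough sketch of the "save it" variant mentioned above, with names of my own choosing, you could stash the gradient for later inspection:)
import torch
grads = {}  # hypothetical container for saved gradients
def save_grad(name):
    # build a hook that records the incoming gradient under `name`
    def hook(grad):
        grads[name] = grad
    return hook
x = torch.tensor([2.0], requires_grad=True)
y = x * x
z = y * y
y.register_hook(save_grad('y'))
z.backward()
print(grads['y'])  # tensor([8.])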