PyTorch can backward twice without setting retain_graph=True
It is said that if you ever want to do the backward on some part of the graph twice, you need to pass in retain_graph=True during the first pass.
However, I found that the snippet below actually works without doing so. I am using pyTorch-0.4.
import torch

x = torch.ones(2, 2, requires_grad=True)
y = x + 2
y.backward(torch.ones(2, 2)) # Note I do not set retain_graph=True
y.backward(torch.ones(2, 2)) # But it can still work!
print(x.grad)
Output:
tensor([[ 2., 2.],
[ 2., 2.]])
Can anyone explain this? Thanks in advance!
The reason it works in your case w/o retain_graph=True is that your graph is very simple and probably has no internal intermediate buffers; in turn, no buffers get freed, so there is no need to use retain_graph=True.

But everything changes when you add one more extra computation to your graph:
Code:
x = torch.ones(2, 2, requires_grad=True)
v = x.pow(3)
y = v + 2
y.backward(torch.ones(2, 2))
print('Backward 1st time w/o retain')
print('x.grad:', x.grad)
print('Backward 2nd time w/o retain')
try:
    y.backward(torch.ones(2, 2))
except RuntimeError as err:
    print(err)
print('x.grad:', x.grad)
Output:
Backward 1st time w/o retain
x.grad: tensor([[3., 3.],
[3., 3.]])
Backward 2nd time w/o retain
Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.
x.grad: tensor([[3., 3.],
[3., 3.]]).
In this case an additional internal v.grad would be computed, but torch does not store intermediate values (intermediate gradients etc.), and with retain_graph=False, v.grad is freed after the first backward.
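As a side note (a minimal sketch, not from the original answer): by default torch does not populate .grad for a non-leaf tensor such as v; if you want to inspect that intermediate gradient, you can ask for it explicitly with v.retain_grad() before calling backward.

Code:
import torch

x = torch.ones(2, 2, requires_grad=True)
v = x.pow(3)
v.retain_grad()  # ask autograd to keep the gradient of this non-leaf tensor
y = v + 2
y.backward(torch.ones(2, 2))
print('v.grad:', v.grad)  # dy/dv = 1        -> tensor([[1., 1.], [1., 1.]])
print('x.grad:', x.grad)  # dy/dx = 3 * x**2 -> tensor([[3., 3.], [3., 3.]])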
So, if you want to backprop a second time, you need to specify retain_graph=True to "keep" the graph.
Code:
x = torch.ones(2, 2, requires_grad=True)
v = x.pow(3)
y = v + 2
y.backward(torch.ones(2, 2), retain_graph=True)
print('Backward 1st time w/ retain')
print('x.grad:', x.grad)
print('Backward 2nd time w/ retain')
try:
    y.backward(torch.ones(2, 2))
except RuntimeError as err:
    print(err)
print('x.grad:', x.grad)
Output:
Backward 1st time w/ retain
x.grad: tensor([[3., 3.],
[3., 3.]])
Backward 2nd time w/ retain
x.grad: tensor([[6., 6.],
[6., 6.]])
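One more detail (not spelled out in the original answer): .grad accumulates across backward calls, which is why the second pass above prints 6 instead of 3. A minimal sketch that zeroes x.grad between passes to get the per-pass gradient:

Code:
import torch

x = torch.ones(2, 2, requires_grad=True)
v = x.pow(3)
y = v + 2
y.backward(torch.ones(2, 2), retain_graph=True)
print(x.grad)   # tensor([[3., 3.], [3., 3.]])
x.grad.zero_()  # clear the accumulated gradient before the second pass
y.backward(torch.ones(2, 2))
print(x.grad)   # tensor([[3., 3.], [3., 3.]]) again, not 6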