Pytorch: How to optimize multiple variables with respect to multiple losses?
I'd like to compute the gradients of different losses with respect to different variables, and then update all of those variables together.
Here is a simple example demonstrating what I want:
import torch as T
x = T.randn(3, requires_grad = True)
y = T.randn(4, requires_grad = True)
z = T.randn(5, requires_grad = True)
x_opt = T.optim.Adadelta([x])
y_opt = T.optim.Adadelta([y])
z_opt = T.optim.Adadelta([z])
for i in range(n_iter):
    x_opt.zero_grad()
    y_opt.zero_grad()
    z_opt.zero_grad()
    shared_computation = foobar(x, y, z)
    x_loss = f(x, y, z, shared_computation)
    y_loss = g(x, y, z, shared_computation)
    z_loss = h(x, y, z, shared_computation)
    x_loss.backward_with_respect_to(x)
    y_loss.backward_with_respect_to(y)
    z_loss.backward_with_respect_to(z)
    x_opt.step()
    y_opt.step()
    z_opt.step()
My question is: how do we do the backward_with_respect_to part in PyTorch? I only want x's gradient w.r.t. x_loss, and likewise for y and z. Then I'd like all of the optimizers to step together (based on the current values of x, y, and z).
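To make the difficulty concrete, here is a minimal sketch (not part of the original post, using the setup above) of what goes wrong with plain .backward() calls: each one accumulates into the .grad of every leaf tensor the loss depends on, so x.grad ends up mixing contributions from all three losses.

# Plain backward() calls are not enough on their own: each call accumulates
# into the .grad of every leaf tensor it reaches (x, y, and z), so after these
# three calls x.grad holds d(x_loss)/dx + d(y_loss)/dx + d(z_loss)/dx
# rather than just d(x_loss)/dx.
x_loss.backward(retain_graph=True)
y_loss.backward(retain_graph=True)
z_loss.backward()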
I've written a function that does this. The two key ingredients are (1) using retain_graph=True for every call to .backward() except the last, and (2) saving the gradients after each call to .backward() and restoring them at the end, right before .step()ing.
def multi_step(losses, optms):
    # optimizers each take a step, with `optms[i]`'s variables being
    # optimized w.r.t. `losses[i]`.
    grads = [None]*len(losses)
    for i, (loss, optm) in enumerate(zip(losses, optms)):
        retain_graph = i != (len(losses)-1)
        optm.zero_grad()
        loss.backward(retain_graph=retain_graph)
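        # `p.grad + 0` copies each gradient tensor, so the saved values are
        # not changed when later losses call .backward() and accumulate into
        # the same .grad fields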
        grads[i] = [
            [
                p.grad+0 for p in group['params']
            ] for group in optm.param_groups
        ]
    for optm, grad in zip(optms, grads):
        for p_group, g_group in zip(optm.param_groups, grad):
            for p, g in zip(p_group['params'], g_group):
                p.grad = g
        optm.step()
With the example code stated in the question, multi_step would be used as follows:
for i in range(n_iter):
    shared_computation = foobar(x, y, z)
    x_loss = f(x, y, z, shared_computation)
    y_loss = g(x, y, z, shared_computation)
    z_loss = h(x, y, z, shared_computation)
    multi_step([x_loss, y_loss, z_loss], [x_opt, y_opt, z_opt])
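As a side note, here is a minimal sketch (not from the original answer) of the same per-variable gradients using torch.autograd.grad, which returns the gradient of a loss with respect to only the tensors you pass in and does not touch the .grad fields of anything else; it assumes the x, y, z, optimizers, and losses from the question's loop:

# One backward pass per loss, but only w.r.t. that loss's own variable.
# retain_graph=True is still needed on all but the last call because the
# losses share `shared_computation`.
gx, = T.autograd.grad(x_loss, [x], retain_graph=True)
gy, = T.autograd.grad(y_loss, [y], retain_graph=True)
gz, = T.autograd.grad(z_loss, [z])
# Hand the gradients to the optimizers and step them all together.
x.grad, y.grad, z.grad = gx, gy, gz
x_opt.step()
y_opt.step()
z_opt.step()

Because autograd.grad does not write into .grad at all, there is nothing to save and restore, which is the main practical difference from the multi_step approach above.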