Tensorflow apply_gradients() with multiple losses

I'm training a model with an intermediate output (a VAEGAN), and I have two losses.

Can I simply sum them and apply the gradients as shown below?

with tf.GradientTape() as tape:
    z_mean, z_log_sigma, z_encoder_output = self.encoder(real_images, training=True)
    kl_loss = self.kl_loss_fn(z_mean, z_log_sigma) * kl_loss_coeff

    fake_images = self.decoder(z_encoder_output)
    fake_inter_activations, logits_fake = self.discriminator(fake_images, training=True)
    real_inter_activations, logits_real = self.discriminator(real_images, training=True)

    rec_loss = self.rec_loss_fn(fake_inter_activations, real_inter_activations) * rec_loss_coeff

    total_encoder_loss = kl_loss + rec_loss

grads = tape.gradient(total_encoder_loss, self.encoder.trainable_weights)
self.e_optimizer.apply_gradients(zip(grads, self.encoder.trainable_weights))

Or do I need to keep them separate as below, making the tape persistent?

with tf.GradientTape(persistent=True) as tape:
    z_mean, z_log_sigma, z_encoder_output = self.encoder(real_images, training=True)
    kl_loss = self.kl_loss_fn(z_mean, z_log_sigma) * kl_loss_coeff

    fake_images = self.decoder(z_encoder_output)
    fake_inter_activations, logits_fake = self.discriminator(fake_images, training=True)
    real_inter_activations, logits_real = self.discriminator(real_images, training=True)

    rec_loss = self.rec_loss_fn(fake_inter_activations, real_inter_activations) * rec_loss_coeff

grads_kl_loss = tape.gradient(kl_loss, self.encoder.trainable_weights)
self.e_optimizer.apply_gradients(zip(grads_kl_loss, self.encoder.trainable_weights))

grads_rec_loss = tape.gradient(rec_loss, self.encoder.trainable_weights)
self.e_optimizer.apply_gradients(zip(grads_rec_loss, self.encoder.trainable_weights))

Yes, you can generally just sum the losses and compute a single gradient. Since the gradient of a sum is the sum of the individual gradients, a step taken on the summed loss is the same as taking the two separate steps one after the other. (This equivalence is exact for a plain SGD update; optimizers that keep per-step state, such as Adam, would update that state twice if you call apply_gradients twice.)
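This linearity is easy to check numerically without TensorFlow at all. A minimal sketch using central finite differences on two toy losses (the functions f and g here are arbitrary illustrations, not the VAEGAN losses):

```python
# Numeric check that the gradient of a sum equals the sum of the
# gradients, using central finite differences on two toy "losses".

def numeric_grad(fn, w, eps=1e-6):
    """Approximate d(fn)/dw at w with a central difference."""
    return (fn(w + eps) - fn(w - eps)) / (2 * eps)

f = lambda w: w ** 2           # toy loss 1
g = lambda w: 3.0 * w          # toy loss 2
total = lambda w: f(w) + g(w)  # summed loss

w = 1.5
grad_of_sum = numeric_grad(total, w)
sum_of_grads = numeric_grad(f, w) + numeric_grad(g, w)

# Both evaluate to roughly 2*w + 3 = 6.0 at w = 1.5.
assert abs(grad_of_sum - sum_of_grads) < 1e-6
```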

Here's a simple example: suppose you have two weights and you are currently at the point (1, 3) (the "starting point"). The update step from loss 1 is (2, -4) and the update step from loss 2 is (1, 2).

  • If you apply the steps one after the other, you first move to (3, -1) and then to (4, 1).
  • If you sum the gradients first, the total step is (3, -2). Following that direction from the starting point also lands you at (4, 1).
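The arithmetic above can be verified directly in plain Python, with the points and steps taken straight from the example:

```python
# Start at (1, 3); the two update steps are (2, -4) and (1, 2).
start = (1.0, 3.0)
step1 = (2.0, -4.0)
step2 = (1.0, 2.0)

def add(p, q):
    """Element-wise addition of two 2-D points."""
    return tuple(a + b for a, b in zip(p, q))

# One step after the other: (1, 3) -> (3, -1) -> (4, 1)
sequential = add(add(start, step1), step2)

# Sum the steps first: total step is (3, -2), so (1, 3) -> (4, 1)
combined = add(start, add(step1, step2))

assert sequential == combined == (4.0, 1.0)
```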