TF 2.0: RuntimeError: GradientTape.gradient can only be called once on non-persistent tapes
In the TF 2.0 DCGAN example from the TensorFlow 2.0 guide, there are two gradient tapes. See below:
@tf.function
def train_step(images):
    noise = tf.random.normal([BATCH_SIZE, noise_dim])

    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        generated_images = generator(noise, training=True)

        real_output = discriminator(images, training=True)
        fake_output = discriminator(generated_images, training=True)

        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)

    gradients_of_generator = gen_tape.gradient(gen_loss, generator.trainable_variables)
    gradients_of_discriminator = disc_tape.gradient(disc_loss, discriminator.trainable_variables)

    generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))
You can clearly see that there are two gradient tapes. I wanted to know what difference using a single tape makes, so I changed the code to the following:
@tf.function
def train_step(images):
    noise = tf.random.normal([BATCH_SIZE, noise_dim])

    with tf.GradientTape() as tape:
        generated_images = generator(noise, training=True)

        real_output = discriminator(images, training=True)
        fake_output = discriminator(generated_images, training=True)

        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)

    gradients_of_generator = tape.gradient(gen_loss, generator.trainable_variables)
    gradients_of_discriminator = tape.gradient(disc_loss, discriminator.trainable_variables)

    generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))
This gives me the following error:
RuntimeError: GradientTape.gradient can only be called once on non-persistent tapes.
I want to know why two tapes are needed. As of now, documentation on the TF 2.0 API is sparse. Can anyone explain this, or point me to the right docs/tutorials?
From the documentation of GradientTape:
By default, the resources held by a GradientTape are released as soon as GradientTape.gradient() method is called. To compute multiple gradients over the same computation, create a persistent gradient tape. This allows multiple calls to the gradient() method as resources are released when the tape object is garbage collected.
A persistent tape can be created with with tf.GradientTape(persistent=True) as tape, and can/should be deleted manually with del tape (thanks for this, @zwep, @Crispy13).
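To illustrate both behaviors outside the GAN code, here is a minimal standalone sketch (not part of the original question): calling gradient twice on a non-persistent tape raises the RuntimeError above, while a persistent tape allows repeated calls until it is deleted.

import tensorflow as tf

x = tf.Variable(3.0)

# Non-persistent tape: its resources are released by the first gradient() call.
with tf.GradientTape() as tape:
    y = x * x  # y = x^2
    z = y * y  # z = x^4

dy_dx = tape.gradient(y, x)    # OK: dy/dx = 2x = 6.0
# tape.gradient(z, x)          # RuntimeError: ... can only be called once ...

# Persistent tape: gradient() may be called any number of times.
with tf.GradientTape(persistent=True) as tape:
    y = x * x
    z = y * y

dy_dx = tape.gradient(y, x)    # 6.0
dz_dx = tape.gradient(z, x)    # dz/dx = 4x^3 = 108.0
del tape                       # release the tape's resources manually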
The technical reason for the error is that gradient is called twice, which is not allowed on (non-persistent) tapes.
In the case at hand, however, the root cause is that GANs are usually trained by alternating between optimizing the generator and optimizing the discriminator. Each optimization has its own optimizer, typically operates on different variables, and even the loss being minimized differs (gen_loss and disc_loss in the code).
So you end up with two gradients because training a GAN inherently means optimizing two different (adversarial) problems in an alternating fashion.
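That said, if you only want the single-tape variant from the question to run, making the tape persistent should be enough; a sketch under that assumption, unchanged apart from persistent=True and the explicit del:

@tf.function
def train_step(images):
    noise = tf.random.normal([BATCH_SIZE, noise_dim])

    # One persistent tape records the forward passes of both networks.
    with tf.GradientTape(persistent=True) as tape:
        generated_images = generator(noise, training=True)

        real_output = discriminator(images, training=True)
        fake_output = discriminator(generated_images, training=True)

        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)

    # gradient() may now be called more than once on the same tape.
    gradients_of_generator = tape.gradient(gen_loss, generator.trainable_variables)
    gradients_of_discriminator = tape.gradient(disc_loss, discriminator.trainable_variables)
    del tape  # release the tape's resources explicitly

    generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))

Note that the persistent tape keeps its recorded state alive until del tape, whereas in the tutorial each non-persistent tape releases its resources as soon as its gradient is computed, which is part of why the two-tape form is the idiomatic one.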