GAN 生成器损失变为零
GAN generator loss goes to zero
我对深度学习比较陌生,请多多包涵。我有一个 GAN,模型结构复制粘贴自:https://machinelearningmastery.com/how-to-develop-a-generative-adversarial-network-for-an-mnist-handwritten-digits-from-scratch-in-keras/
它将训练 100-200 个 epoch,结果还不错,然后突然生成器损失降为零...这里是日志的摘录:
epoch,step,gen_loss,discr_loss
...
189,25,0.208,0.712
189,26,3.925,1.501
189,27,0.269,1.400
189,28,7.814,2.536
189,29,0.000,3.387 // here?!?
189,30,0.000,7.903
189,31,16.118,7.745
189,32,16.118,8.059
189,33,16.118,8.059
189,34,16.118,8.059
... etc, it never recovers
这是梯度消失的问题吗?还有什么我想念的吗?
博文评论里有人在争论GAN的崩溃问题,这里有评论:
There were problems with the discriminator collapsing to zero on occasions. This seems to be a known feature of GANs. Do any established GAN hacks help with this?
Looking at the discriminator after 100 epochs, it was in a confused state where everything passed into it was circa 50% probability real/fake. I colour coded some generated examples based on disriminator probability (red = fake, green = real, blue = unsure based on an arbitrary banding) and as you mentioned the subjective versus discriminator output does not always tie up. (example posted on linkedin). There was not enough spread in discriminator probability output to make this meaningful.
我对深度学习比较陌生,请多多包涵。我有一个 GAN,模型结构复制粘贴自:https://machinelearningmastery.com/how-to-develop-a-generative-adversarial-network-for-an-mnist-handwritten-digits-from-scratch-in-keras/
它将训练 100-200 个 epoch,结果还不错,然后突然生成器损失降为零...这里是日志的摘录:
epoch,step,gen_loss,discr_loss
...
189,25,0.208,0.712
189,26,3.925,1.501
189,27,0.269,1.400
189,28,7.814,2.536
189,29,0.000,3.387 // here?!?
189,30,0.000,7.903
189,31,16.118,7.745
189,32,16.118,8.059
189,33,16.118,8.059
189,34,16.118,8.059
... etc, it never recovers
这是梯度消失的问题吗?还有什么我想念的吗?
博文评论里有人在争论GAN的崩溃问题,这里有评论:
There were problems with the discriminator collapsing to zero on occasions. This seems to be a known feature of GANs. Do any established GAN hacks help with this? Looking at the discriminator after 100 epochs, it was in a confused state where everything passed into it was circa 50% probability real/fake. I colour coded some generated examples based on disriminator probability (red = fake, green = real, blue = unsure based on an arbitrary banding) and as you mentioned the subjective versus discriminator output does not always tie up. (example posted on linkedin). There was not enough spread in discriminator probability output to make this meaningful.