The output of softmax makes the binary cross-entropy's output NaN, what should I do?

Question

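I have implemented a neural network in TensorFlow whose last layer is a convolutional layer. I feed the output of that layer into a softmax activation function, and then into a cross-entropy loss function, defined below, together with the labels. The problem is that I get NaN as the output of the loss function, and I found that this happens because the softmax output contains 1s. What should I do in this case? My input is a 16 x 16 image in which every pixel has the value 0 or 1 (binary classification).

My loss function: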

#Loss function
def loss(prediction, label):
    #with tf.variable_scope("Loss") as Loss_scope:
    log_pred = tf.log(prediction, name='Prediction_Log')
    log_pred_2 = tf.log(1-prediction, name='1-Prediction_Log')
    cross_entropy = -tf.multiply(label, log_pred) - tf.multiply((1-label), log_pred_2) 

    return cross_entropy

Answer 1

Note that log(0) is undefined, so if prediction==0 or prediction==1 you will get a NaN.
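To make the mechanism concrete, here is a minimal pure-Python sketch of the question's formula for a single pixel. It assumes that the logarithm of 0 evaluates to -inf (which is what tf.log returns) and emulates that explicitly, since math.log raises an error instead. The helper name naive_bce is only for illustration and is not part of TensorFlow.

```python
import math

def naive_bce(p, y):
    """Binary cross-entropy for one pixel, written like the question's loss."""
    # Emulate tf.log, which yields -inf for an input of 0
    # (math.log would raise ValueError instead).
    log_p = math.log(p) if p > 0 else float("-inf")
    log_1mp = math.log(1 - p) if p < 1 else float("-inf")
    return -y * log_p - (1 - y) * log_1mp

# prediction == 1 with label == 1: the second term becomes 0 * (-inf),
# which is NaN under IEEE 754 arithmetic, so the whole loss is NaN.
loss_value = naive_bce(1.0, 1.0)
print(math.isnan(loss_value))
```

So the NaN appears even when the prediction is "perfectly right": the term that should vanish is a zero multiplied by an infinity.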

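To get around this, it is common practice in any loss function to add a very small value, epsilon, to whatever is passed to tf.log (we do something similar when dividing, to avoid division by zero). This makes the loss function numerically stable, and epsilon is small enough that any inaccuracy it introduces into the loss is negligible.

Maybe try: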

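import tensorflow as tf

# Loss function
def loss(prediction, label):
    #with tf.variable_scope("Loss") as Loss_scope:

    # Small constant that keeps both log arguments strictly positive.
    epsilon = tf.constant(0.000001)
    # tf.log is the TensorFlow 1.x name; later releases expose it as tf.math.log.
    log_pred = tf.log(prediction + epsilon, name='Prediction_Log')
    log_pred_2 = tf.log(1 - prediction + epsilon, name='1-Prediction_Log')

    # Element-wise binary cross-entropy.
    cross_entropy = -tf.multiply(label, log_pred) - tf.multiply((1 - label), log_pred_2)
    return cross_entropy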
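As a rough sanity check of the claim that epsilon costs almost nothing, the following pure-Python sketch evaluates the same formula for single scalar values. The function names are hypothetical and chosen only for this illustration; the epsilon value is the one used above.

```python
import math

EPSILON = 1e-6  # same value as in the snippet above

def bce_with_epsilon(p, y, eps=EPSILON):
    """Per-pixel binary cross-entropy with epsilon added inside each log."""
    return -y * math.log(p + eps) - (1 - y) * math.log(1 - p + eps)

def bce_exact(p, y):
    """Reference formula without epsilon; only valid for 0 < p < 1."""
    return -y * math.log(p) - (1 - y) * math.log(1 - p)

# At the boundary the epsilon version stays finite instead of producing NaN.
boundary_loss = bce_with_epsilon(1.0, 1.0)

# Away from the boundary the two versions differ only by a tiny amount.
interior_error = abs(bce_with_epsilon(0.5, 1.0) - bce_exact(0.5, 1.0))

print(math.isfinite(boundary_loss), interior_error)
```

One side effect worth knowing: at the boundary the loss can come out very slightly negative, because log(1 + epsilon) is a little above zero.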

Update:

As jdehesa pointed out in his comment, the 'out of the box' loss functions already handle the numerical stability problem well.
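For context, the built-in losses avoid the problem by working on the raw logits (the values before the activation) rather than on probabilities. In TensorFlow that would mean passing the convolution output directly to something like tf.nn.sigmoid_cross_entropy_with_logits(labels=..., logits=...). The sketch below is a pure-Python illustration of the stable form that function's documentation describes, max(x, 0) - x * z + log(1 + exp(-|x|)). It assumes a per-pixel sigmoid output, and the function name is hypothetical.

```python
import math

def sigmoid_bce_from_logit(x, z):
    """Binary cross-entropy of sigmoid(x) against label z, computed from the logit x.

    Algebraically equal to -z*log(sigmoid(x)) - (1-z)*log(1-sigmoid(x)),
    but it never takes the log of 0 and never exponentiates a large positive number.
    """
    return max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))

# A saturated, correct prediction: the loss is finite and essentially zero.
saturated_correct = sigmoid_bce_from_logit(100.0, 1.0)

# A saturated, wrong prediction: the loss is large but still finite.
saturated_wrong = sigmoid_bce_from_logit(100.0, 0.0)

print(saturated_correct, saturated_wrong)
```

Because the probability is never materialized, there is no value that can round to exactly 0 or 1 before the logarithm is taken.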

python

conv-neural-network

tensorflow

softmax

cross-entropy