TensorFlow convolution tutorial: scale of the logits

Question

I am trying to build my own model by adding some code to cifar10.py, and this is where I ran into a question.

In cifar10.py, the [tutorial][1] says:

EXERCISE: The output of inference are un-normalized logits. Try editing the network architecture to return normalized predictions using tf.nn.softmax().

So I fed the output of "local4" directly into tf.nn.softmax(). This gives me scaled logits, meaning the logits for each example sum to 1.

But in the loss function, the cifar10.py code uses:

tf.nn.sparse_softmax_cross_entropy_with_logits()

The documentation for this function says:

WARNING: This op expects unscaled logits, since it performs a softmax on logits internally for efficiency. Do not call this op with the output of softmax, as it will produce incorrect results.

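Also, according to the documentation, the logits passed to this function must have shape [batch_size, num_classes], which I take to mean that they should be the unscaled, pre-softmax values, as in the sample code, which computes the un-normalized logits like this: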

  # softmax, i.e. softmax(WX + b)
  with tf.variable_scope('softmax_linear') as scope:
    weights = _variable_with_weight_decay('weights', [192, NUM_CLASSES],
                                          stddev=1/192.0, wd=0.0)
    biases = _variable_on_cpu('biases', [NUM_CLASSES],
                              tf.constant_initializer(0.0))
    softmax_linear = tf.add(tf.matmul(local4, weights), biases, name=scope.name)
    _activation_summary(softmax_linear)

Does this mean I don't have to use tf.nn.softmax in my code?

Answer 1

You can use tf.nn.softmax in your code if you want, but then you have to compute the loss yourself:

# Assumes labels are one-hot floats of shape [batch_size, num_classes].
probabilities = tf.nn.softmax(logits)
loss = tf.reduce_mean(-tf.reduce_sum(labels * tf.log(probabilities), axis=1))
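
To make the warning in the op's documentation concrete, here is a minimal pure-Python sketch of the arithmetic for a single example. It does not use TensorFlow; the helper names `softmax` and `cross_entropy_from_logits` and the sample values are made up for illustration, and it assumes the usual definition of softmax cross-entropy with an integer class label.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_from_logits(logits, label):
    """Softmax cross-entropy for one example; softmax is applied internally."""
    return -math.log(softmax(logits)[label])

# Hypothetical unscaled logits for one example with 3 classes; true class is 0.
logits = [2.0, 1.0, 0.1]
label = 0

probs = softmax(logits)

# Correct: pass the unscaled logits to the op that applies softmax itself.
correct_loss = cross_entropy_from_logits(logits, label)

# Equivalent: apply softmax yourself, then take -log of the true-class probability.
manual_loss = -math.log(probs[label])

# Incorrect: softmax output passed in as "logits", so softmax is applied twice.
double_softmax_loss = cross_entropy_from_logits(probs, label)

print(correct_loss, manual_loss, double_softmax_loss)
```

The first two values agree, while the third is larger: applying softmax twice flattens the distribution, so the loss no longer reflects the model's actual confidence.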

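In practice, you would not use tf.nn.softmax to compute the loss. However, you do need tf.nn.softmax if you want to compute your model's predictions and compare them with the true labels (to compute accuracy).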
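
As a rough illustration of that last step, the following pure-Python sketch (again without TensorFlow, with hypothetical helper names and made-up values) turns a small batch of logits into probabilities, picks the most likely class per example, and computes accuracy. One thing it also shows: softmax preserves the ordering of the scores, so the predicted class is the same whether you take the argmax of the logits or of the probabilities; softmax matters when you want the probabilities themselves.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])

# Hypothetical batch of unscaled logits, shape [batch_size=4, num_classes=3].
batch_logits = [
    [2.0, 1.0, 0.1],
    [0.3, 2.5, -1.0],
    [0.2, 0.1, 1.7],
    [1.2, 0.4, 0.9],
]
true_labels = [0, 1, 2, 2]  # made-up ground truth

# Normalized predictions: one probability distribution per example.
batch_probs = [softmax(row) for row in batch_logits]

# Predicted class = index of the highest probability.
predictions = [argmax(row) for row in batch_probs]

# Fraction of examples whose predicted class matches the true label.
accuracy = sum(p == t for p, t in zip(predictions, true_labels)) / len(true_labels)

print(predictions, accuracy)
```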
