TensorFlow sequence_loss with label_smoothing

Question

Is it possible to use the label_smoothing feature of tf.losses.softmax_cross_entropy together with tf.contrib.seq2seq.sequence_loss?

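I can see that sequence_loss optionally takes a softmax_loss_function as a parameter. However, that function receives the targets as a list of ints, not as the one-hot encoded vectors required by tf.losses.softmax_cross_entropy, which is also the only function in TensorFlow that supports label_smoothing.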

Can you recommend a way of making label_smoothing work with sequence_loss?

Answer 1

This can't be done efficiently.

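tf.contrib.seq2seq.sequence_loss is designed to work with very large vocabularies, so it expects a loss function from the sparse family (see the linked details). The main difference is that the labels use ordinal encoding instead of one-hot, because the latter takes too much memory. The actual one-hot encoding is never computed.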

On the other hand, the label_smoothing parameter of tf.losses.softmax_cross_entropy is an option that manipulates the one-hot encoding. Here is what it does:

if label_smoothing > 0:
  num_classes = math_ops.cast(
      array_ops.shape(onehot_labels)[1], logits.dtype)
  smooth_positives = 1.0 - label_smoothing
  smooth_negatives = label_smoothing / num_classes
  onehot_labels = onehot_labels * smooth_positives + smooth_negatives
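
To make the arithmetic concrete, here is a minimal plain-Python sketch of the same rule applied to a single one-hot row. The helper name smooth_one_hot is hypothetical and is not part of TensorFlow; it only mirrors the three assignments quoted above.

```python
def smooth_one_hot(onehot_row, label_smoothing):
    """Apply the smoothing rule quoted above to one one-hot row (a list of floats)."""
    num_classes = len(onehot_row)
    smooth_positives = 1.0 - label_smoothing
    smooth_negatives = label_smoothing / num_classes
    # Every entry is scaled, then shifted by the same small constant.
    return [x * smooth_positives + smooth_negatives for x in onehot_row]


# Four classes, true class at index 1, smoothing factor 0.1.
smoothed = smooth_one_hot([0.0, 1.0, 0.0, 0.0], label_smoothing=0.1)
print(smoothed)
```

With four classes and a factor of 0.1, the true class ends up at roughly 0.925 and every other class at roughly 0.025, so the row still sums to 1.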

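As you can see, to compute this tensor, onehot_labels has to be stored explicitly, which is exactly what the sparse functions try to avoid. That's why neither tf.nn.sparse_softmax_cross_entropy_with_logits nor tf.contrib.seq2seq.sequence_loss provides a similar parameter. Of course, you can do the conversion yourself, but this defeats the whole optimization.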
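
As an illustration of what "doing the conversion yourself" costs, here is a rough plain-Python sketch, not TensorFlow code. The function name smoothed_softmax_loss is hypothetical; its (labels, logits) arguments are only meant to resemble what a softmax_loss_function receives, namely integer targets and one row of logits per target. Note that it builds a dense target row as wide as the vocabulary for every label, which is the memory cost the sparse functions are designed to avoid.

```python
import math


def smoothed_softmax_loss(labels, logits, label_smoothing=0.1):
    """Per-example cross-entropy against smoothed one-hot targets.

    labels: list of int class ids.
    logits: list of rows, each a list of floats (one score per class).
    """
    losses = []
    for label, row in zip(labels, logits):
        num_classes = len(row)
        # Dense target row: this is the explicit one-hot storage
        # that the sparse loss functions never materialize.
        target = [label_smoothing / num_classes] * num_classes
        target[label] += 1.0 - label_smoothing
        # Numerically stable log-softmax of the logits.
        row_max = max(row)
        log_z = row_max + math.log(sum(math.exp(v - row_max) for v in row))
        log_probs = [v - log_z for v in row]
        losses.append(-sum(t * lp for t, lp in zip(target, log_probs)))
    return losses


# Two time steps over a three-word vocabulary.
print(smoothed_softmax_loss([2, 0], [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]]))
```

In TensorFlow itself the corresponding step would presumably be tf.one_hot on the integer targets, followed by tf.losses.softmax_cross_entropy with label_smoothing set. That is an assumption about how one might wire it up, not something the answer spells out.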
