Is this one-hot encoding in TensorFlow fast? Or flawed for any reason?

Question

There are a few Stack Overflow questions about computing one-hot embeddings with TensorFlow, and here is the accepted solution:

num_labels = 10
sparse_labels = tf.reshape(label_batch, [-1, 1])
derived_size = tf.shape(label_batch)[0]
indices = tf.reshape(tf.range(0, derived_size, 1), [-1, 1])
concated = tf.concat(1, [indices, sparse_labels])
outshape = tf.reshape(tf.concat(0, [derived_size, [num_labels]]), [-1])
labels = tf.sparse_to_dense(concated, outshape, 1.0, 0.0)

This is almost identical to the code in an official tutorial: https://www.tensorflow.org/versions/0.6.0/tutorials/mnist/tf/index.html

It seems to me that, since tf.nn.embedding_lookup exists, it is probably more efficient. Here is a version that uses it, and it supports arbitrarily shaped inputs:

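import numpy as np
import tensorflow as tf

def one_hot(inputs, num_classes):
    # Row i of the identity matrix is the one-hot vector for class i.
    with tf.device('/cpu:0'):
        table = tf.constant(np.identity(num_classes, dtype=np.float32))
        embeddings = tf.nn.embedding_lookup(table, inputs)
    return embeddings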

Would you expect this implementation to be faster? And is it flawed for any other reason?

Answer 1

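The one_hot() function in your question looks correct. However, the reason we do not recommend writing code this way is that it is very memory inefficient. To understand why, suppose you have a batch size of 32 and 1,000,000 classes.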

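In the version suggested in the tutorial, the largest tensor will be the result of tf.sparse_to_dense(), which will be 32 x 1000000.
In the one_hot() function in the question, the largest tensor will be the result of np.identity(1000000), which is 4 terabytes. Of course, allocating this tensor probably won't succeed. Even if the number of classes is much smaller, it still wastes memory to store all of those zeros explicitly; TensorFlow does not automatically convert your data to a sparse representation, even though it might be profitable to do so.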
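To make the comparison concrete, here is a rough back-of-the-envelope sketch of the two sizes. It assumes 4-byte float32 elements (matching the dtype=np.float32 in the question); the helper name dense_bytes is made up for this illustration and is not a TensorFlow function.

```python
# Rough memory estimate for the two approaches, assuming float32 (4 bytes per element).
BYTES_PER_FLOAT32 = 4


def dense_bytes(rows, cols, bytes_per_element=BYTES_PER_FLOAT32):
    """Bytes needed to store a dense rows x cols tensor (hypothetical helper)."""
    return rows * cols * bytes_per_element


batch_size = 32
num_classes = 1_000_000

# Tutorial version: the largest tensor is the [batch_size, num_classes] one-hot batch.
one_hot_batch = dense_bytes(batch_size, num_classes)

# one_hot() from the question: the largest tensor is the
# [num_classes, num_classes] identity table.
identity_table = dense_bytes(num_classes, num_classes)

print(one_hot_batch / 1e6, "MB")    # 128.0 MB
print(identity_table / 1e12, "TB")  # 4.0 TB
```

Under these assumptions the identity table is num_classes / batch_size = 31,250 times larger than the one-hot batch, and it grows quadratically with the number of classes rather than linearly.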

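Finally, I want to offer a plug for a new function that was recently added to the open-source repository and will be available in the next release. tf.nn.sparse_softmax_cross_entropy_with_logits() allows you to specify a vector of integers as the labels, and saves you from having to build the dense one-hot representation. For large numbers of classes, it should be more efficient than either solution.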
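As a sketch of why integer labels are enough: softmax cross-entropy against a one-hot target reduces to the negative log-probability of the true class, so the dense one-hot row never needs to exist. The plain-Python illustration below does not call TensorFlow, and the function names in it are made up for the example; it only checks that the two formulations agree on one row of logits.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one row of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def dense_cross_entropy(logits, one_hot):
    """Cross-entropy against an explicit one-hot target (O(num_classes) label storage)."""
    log_probs = log_softmax(logits)
    return -sum(t * lp for t, lp in zip(one_hot, log_probs))


def sparse_cross_entropy(logits, label):
    """Same loss computed from an integer class index, with no one-hot row."""
    return -log_softmax(logits)[label]


logits = [2.0, -1.0, 0.5, 0.0]
label = 2
one_hot = [1.0 if i == label else 0.0 for i in range(len(logits))]

dense_loss = dense_cross_entropy(logits, one_hot)
sparse_loss = sparse_cross_entropy(logits, label)
print(math.isclose(dense_loss, sparse_loss))  # True
```

With a batch of 32 and 1,000,000 classes, the integer form stores 32 labels instead of 32,000,000 mostly-zero floats.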

tensorflow