TensorFlow: best method for non-one-hot vectors?

Question

I currently have a CNN that predicts classes and is fed with one-hot y_train vectors like this:

[ 0., 0., 0., 0., 0., 1.0, 0., 0., 0., 0., 0. ]

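So np.argmax(y_train) returns 5, because that is the correct class.

Unfortunately, my network has far too little training data to classify my test data correctly. The training error improves over time, but I think that is overfitting. The test error is always terrible.

I thought a normal distribution might make life easier for the network. Instead of saying class 5 is the only correct class and all other classes are wrong, I want to give the network a high penalty if it predicts class 0 instead of 5, but only a small penalty if it predicts 4 instead of 5. A y_train vector encoding class 5 could then look like this: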

# Values are rounded so they fit on one line; the original vector
# is always full-precision tf.float32.
[ 0., 0., 0.004, 0.054, 0.242, 0.399, 0.242, 0.054, 0.004, 0., 0. ]
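For illustration, here is a minimal sketch of how such a soft label could be generated. The helper name gaussian_soft_label and the sigma parameter are assumptions made for this example, not part of the original code, and the weights are explicitly renormalized so that they sum to 1.

```python
import math

def gaussian_soft_label(true_class, num_classes, sigma=1.0):
    """Hypothetical helper: bell-shaped label weights centred on true_class."""
    # Unnormalised Gaussian weight for every class index.
    weights = [math.exp(-0.5 * ((i - true_class) / sigma) ** 2)
               for i in range(num_classes)]
    # Divide by the sum so the result is a valid probability distribution.
    total = sum(weights)
    return [w / total for w in weights]

label = gaussian_soft_label(true_class=5, num_classes=11)
print([round(v, 3) for v in label])
```

With sigma=1.0 the rounded values should match the vector in the question. A larger sigma spreads the weight over more neighbouring classes.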

The question is: which cost function is appropriate?

Would

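# Named arguments avoid mixing up the order; TensorFlow 1.x rejects positional arguments here.
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=y_train, logits=y_net)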

train_step = tf.train.AdamOptimizer(0.001).minimize(cross_entropy)

still produce reasonable results, even if the y_train vectors are no longer one-hot encoded?
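To get a feel for how softmax cross-entropy behaves with such a soft target, here is a small plain-Python sketch (no TensorFlow involved). The helpers softmax, cross_entropy and peaked_logits are written for this example only, and the "network output" is simulated by logits with a single peak at the guessed class, which is an assumption made purely for illustration.

```python
import math

def softmax(logits):
    # Subtract the maximum for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(soft_labels, logits):
    # -sum(y_i * log(p_i)); terms with y_i == 0 contribute nothing.
    probs = softmax(logits)
    return -sum(y * math.log(p) for y, p in zip(soft_labels, probs) if y > 0)

def peaked_logits(predicted_class, num_classes, height=5.0):
    # Hypothetical stand-in for a network output that favours one class.
    return [height if i == predicted_class else 0.0 for i in range(num_classes)]

# Rounded soft target for class 5, renormalised so it sums to 1.
raw = [0., 0., 0.004, 0.054, 0.242, 0.399, 0.242, 0.054, 0.004, 0., 0.]
target = [v / sum(raw) for v in raw]

for guess in (5, 4, 0):
    print(guess, cross_entropy(target, peaked_logits(guess, len(target))))
```

Under these assumptions the loss is lowest for a guess of 5, somewhat higher for 4, and highest for 0, which is the graded penalty described above.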

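Or does this require more substantial architectural changes? Currently I have two Conv/Pool layers and two fully connected layers. The output is just y_net = tf.matmul(h_fc1_drop, W_fc2) + b_fc2, because tf.nn.softmax_cross_entropy_with_logits applies the softmax itself.

My actual architecture outputs an 800-dimensional vector. With so little training data, the network almost never hits exactly the right class out of 800 on the test data, only on the training data (overfitting).

However, I would be totally OK if the network predicted the class within a range like +/- 20. So if class 400 is the right one, predicting a class between 380 and 420 would be enough for me.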
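A tolerance like this can also be used when evaluating the network, instead of exact-match accuracy. The following is a minimal sketch; the function name accuracy_within and the sample predictions are made up for illustration, and the inputs are assumed to be non-empty lists of integer class indices.

```python
def accuracy_within(predictions, targets, tolerance=20):
    """Fraction of predictions at most `tolerance` classes away from the target."""
    hits = sum(1 for p, t in zip(predictions, targets) if abs(p - t) <= tolerance)
    return hits / len(targets)

# Hypothetical predicted class indices for four samples whose true class is 400.
predictions = [380, 420, 421, 400]
targets = [400, 400, 400, 400]
print(accuracy_within(predictions, targets))
```

Here 380, 420 and 400 count as hits and 421 as a miss.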

Answer 1

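According to the documentation, softmax_cross_entropy_with_logits should support your use case: the labels do not have to be one-hot, as long as each row of labels is a valid probability distribution.

Since you are fine with predictions within a +/- 20 range rather than the exact label, a different loss may be more appropriate. For example, the MSE between the target value (e.g. 400) and your prediction (e.g. 420)? This part sounds more like a research question than a TensorFlow question.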
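As a rough sketch of that idea, treating the class index as a plain number, the MSE grows with the squared distance between prediction and target. The function below is written for illustration only and is not TensorFlow code.

```python
def mean_squared_error(predictions, targets):
    """Average squared difference between predicted and target values."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

print(mean_squared_error([420], [400]))  # off by 20
print(mean_squared_error([0], [400]))    # off by 400
```

A prediction that is off by 20 is penalized far less than one that is off by 400, which is the graded behaviour asked for. Note that this turns the task into regression, so the network would output a single value instead of 800 logits.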

Answer 2

"However, I would be totally ok if the network predicts the class in a range like +/- 20. So if class 400 is the right one, predicting a class between 380 and 420 would be enough for me."

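I think the obvious thing to do here is to make your labels coarser. For example, if you are predicting depth values, you could use 100 distinct values instead of 800.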
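One possible way to do this, sketched under the assumption that the fine labels are integers from 0 to 799 and that equal-width bins are acceptable (the helper names coarsen and bin_center are hypothetical):

```python
def coarsen(label, num_fine=800, num_coarse=100):
    """Map a fine label in [0, num_fine) to a coarse bin in [0, num_coarse)."""
    return label * num_coarse // num_fine

def bin_center(coarse_label, num_fine=800, num_coarse=100):
    """Approximate fine-scale value represented by a coarse bin."""
    width = num_fine / num_coarse
    return (coarse_label + 0.5) * width

print(coarsen(400))    # coarse bin for fine class 400
print(bin_center(50))  # fine-scale value at the middle of bin 50
```

With 800 fine classes and 100 bins, each bin covers 8 neighbouring classes, so any prediction that lands in the correct bin is at most 7 classes away from the true label, well inside the +/- 20 tolerance.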
