Logits and labels must have the same first dimension error, despite using sparse categorical crossentropy for sparse targets

Question

These are the shapes of my features and target variables.

(1382, 1785, 2) (1382, 2)

The target has two labels, and each label has the same 28 classes. I have the following CNN:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

model = Sequential()
model.add(Conv1D(100, 5, activation='relu', input_shape=(1785, 2)))
model.add(MaxPooling1D(pool_size=5))
model.add(Conv1D(64, 10, activation='relu'))
model.add(MaxPooling1D(pool_size=4))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dense(256, activation='relu'))
model.add(Dense(28, activation='softmax'))

When I use one-hot encoded targets (1382, 28) with the categorical crossentropy loss function, the model runs fine with no errors.

But when I use sparse targets (1382, 2) with the sparse categorical crossentropy loss function, I get the following error.

logits and labels must have the same first dimension, got logits shape [20,28] and labels shape [40]
 [[node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits (defined at \AppData\Local\Temp/ipykernel_9932/3729291395.py:1) ]] [Op:__inference_train_function_11741]

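From what I have seen, the other people who posted about this error were using sparse categorical crossentropy with one-hot encoded targets, which is not my case.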

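I think there may be a problem with the shape of the batches. The logits shape becomes [x, 28], where x is the batch size. Another possible problem is that I have two labels, but I have no clue how to troubleshoot the issue from there.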

Any help is highly appreciated.

Answer 1

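If you use SparseCategoricalCrossentropy as your loss function, you need to make sure that each sample in your data belongs to exactly one class, given as an integer from 0 to 27. For example: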

import tensorflow as tf

samples = 25
# One integer class index in [0, 28) per sample
labels = tf.random.uniform((samples,), maxval=28, dtype=tf.int32)
print(labels)

tf.Tensor(
[12  7  1 13 22 14 26 13  6  1 27  1 11 18  5 18  5  6 12 14 21 18 17 12
  5], shape=(25,), dtype=int32)

Note the shape of labels: it is neither (25, 2) nor (25, 28) but (25,), and that is the shape that can be used with SparseCategoricalCrossentropy when the model outputs (25, 28).
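The labels shape [40] in the error is consistent with a batch of 20 rows of 2 labels being flattened (20 × 2 = 40), while the logits have only 20 rows. One possible way to keep both labels is to give the network one 28-way softmax head per label and pass each label column as its own (N,) target. The following is only a sketch under that assumption: the function name `build_two_head_model`, the head names `label_a` and `label_b`, and the random stand-in data are hypothetical and not from the question.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

N_CLASSES = 28  # classes per label, as stated in the question


def build_two_head_model(input_shape=(1785, 2), n_classes=N_CLASSES):
    """Hypothetical variant of the question's CNN with one softmax head per label."""
    inputs = tf.keras.Input(shape=input_shape)
    x = layers.Conv1D(100, 5, activation="relu")(inputs)
    x = layers.MaxPooling1D(pool_size=5)(x)
    x = layers.Conv1D(64, 10, activation="relu")(x)
    x = layers.MaxPooling1D(pool_size=4)(x)
    x = layers.Flatten()(x)
    x = layers.Dense(512, activation="relu")(x)
    x = layers.Dense(256, activation="relu")(x)
    # Two independent heads, each predicting one of the two labels.
    out_a = layers.Dense(n_classes, activation="softmax", name="label_a")(x)
    out_b = layers.Dense(n_classes, activation="softmax", name="label_b")(x)
    model = tf.keras.Model(inputs, [out_a, out_b])
    # One sparse loss per head; each head receives targets of shape (N,).
    model.compile(
        optimizer="adam",
        loss=["sparse_categorical_crossentropy", "sparse_categorical_crossentropy"],
    )
    return model


# Random stand-in data with the question's per-sample shapes (assumption: 8 samples).
X = np.random.rand(8, 1785, 2).astype("float32")
y = np.random.randint(0, N_CLASSES, size=(8, 2))

model = build_two_head_model()
# Split the (N, 2) target into two (N,) integer vectors, one per head.
model.fit(X, [y[:, 0], y[:, 1]], batch_size=4, epochs=1, verbose=0)

pred_a, pred_b = model.predict(X, verbose=0)
print(pred_a.shape, pred_b.shape)  # (8, 28) (8, 28)
```

With real data, `X` and `y` would be replaced by the (1382, 1785, 2) features and (1382, 2) targets. Whether two heads sharing one trunk is the right design depends on how related the two labels are.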


Tags: python, conv-neural-network, keras, tensorflow, sparsecategoricalcrossentropy