Using a sparse array to represent labels when training a Keras model

Question

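I'm building a Keras model that classifies data into 3000 different classes. My training data contains a large number of samples, so after one-hot encoding the training output the data is very large (item_count * 3000 * float size + input data size). Is it possible to pass a sparse array to Keras as the training output? Are there any suggested solutions?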

Answer 1

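You can use the sparse_categorical_crossentropy loss function, which accepts a sparse representation of the ground truth: y_train holds one integer class index per sample instead of a 3000-element one-hot vector.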

# assuming get_model() returns your Keras model with an output_shape == [None, 3000]
# assuming get_data() returns training data, with y_train having shape == [num_samples]

x_train, y_train = get_data()
model = get_model()
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10, batch_size=16)
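
To see why this helps, here is a small NumPy-only sketch (no Keras needed). It compares the memory taken by integer labels with their one-hot equivalent, and checks that the cross-entropy computed from the integer labels matches the one computed from the one-hot matrix. The sample count of 1000, the random labels, and the random "predictions" are illustrative assumptions, not values from the question.

```python
import numpy as np

# Illustrative sizes: 3000 classes as in the question, 1000 samples assumed.
num_samples, num_classes = 1000, 3000
rng = np.random.default_rng(0)

# Sparse representation: one integer class index per sample.
y_sparse = rng.integers(0, num_classes, size=num_samples).astype(np.int32)

# Dense one-hot representation of the same labels.
y_onehot = np.zeros((num_samples, num_classes), dtype=np.float32)
y_onehot[np.arange(num_samples), y_sparse] = 1.0

print(y_sparse.nbytes)   # 4000 bytes
print(y_onehot.nbytes)   # 12000000 bytes

# Made-up softmax outputs standing in for model predictions.
logits = rng.normal(size=(num_samples, num_classes))
logits -= logits.max(axis=1, keepdims=True)  # for numerical stability
probs = np.exp(logits)
probs /= probs.sum(axis=1, keepdims=True)

# Categorical cross-entropy using the one-hot labels.
loss_from_onehot = -(y_onehot * np.log(probs)).sum(axis=1).mean()

# The same loss using only the integer labels: pick the probability
# of the true class for each sample.
loss_from_sparse = -np.log(probs[np.arange(num_samples), y_sparse]).mean()

assert np.isclose(loss_from_onehot, loss_from_sparse)
```

The integer labels take num_classes times less memory here (both arrays use 4-byte elements), and the loss value is the same, so nothing is lost by skipping the one-hot step.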

python

sparse-matrix

keras

tensorflow