Error when using one hot vectors as labels for training

I am working on a binary classification problem (the infamous Titanic example on Kaggle) and I built an MLP network whose output layer has size 2 with a softmax activation. I am using a label vector of shape (number_of_examples,) and it works (i.e. it does not raise any error), but I realized that it actually should not work. How can a label of 0 or 1 be compared with a softmax vector? That is, if an example has label 1 (i.e. "Survived") and the network outputs (0.21, 0.79), how are these two values used to compute the loss?
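For context, a minimal NumPy sketch (not the actual Keras source) of what `sparse_categorical_crossentropy` does with an integer label: the label is used as an index into the softmax output vector, so no one-hot encoding is needed.

```python
import numpy as np

def sparse_categorical_crossentropy(label, probs):
    # label: integer class index; probs: softmax output vector.
    # The integer label simply selects the predicted probability
    # of the true class; the loss is its negative log.
    return -np.log(probs[label])

probs = np.array([0.21, 0.79])                    # network output for one example
loss = sparse_categorical_crossentropy(1, probs)  # label 1 -> "Survived"
print(round(loss, 4))                             # -log(0.79) ≈ 0.2357
```

So an integer label of 1 and the output (0.21, 0.79) produce the loss -log(0.79); that is why integer labels work with a 2-node softmax output.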

So I replaced the y vector with its one-hot equivalent, but then I get this error:

InvalidArgumentError: logits and labels must have the same first dimension, got logits shape [10,2] and labels shape [20]
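A plain NumPy sketch of where the shapes in the error likely come from, assuming a batch size of 10 (matching `batch_size=10` in `model.fit`): the sparse loss expects integer labels of shape (batch,), so when it receives one-hot labels of shape (10, 2) it treats them as a flat run of 20 scalar "labels".

```python
import numpy as np

batch_labels_int = np.zeros(10)          # shape (10,): what the sparse loss expects
batch_labels_onehot = np.zeros((10, 2))  # shape (10, 2): what was actually passed

# Flattened, the one-hot batch has 20 entries, which is where
# "labels shape [20]" in the error message comes from.
print(batch_labels_onehot.reshape(-1).shape)  # (20,)
```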

import numpy as np
import pandas as pd
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.layers.experimental import preprocessing
from sklearn.model_selection import train_test_split

titanic_all = pd.read_csv("https://raw.githubusercontent.com/aymeric75/IA/master/train.csv")
titanic_all = titanic_all.replace({'male': 1, 'female': 0})


features = ["Age", "Fare", "Sex", "Pclass", "SibSp"]
X = titanic_all[features].copy()
y = titanic_all.pop('Survived')

print(y.shape)

# Making labels, one hot vectors (so they can be compared with the output layer (2 nodes))
y = keras.utils.to_categorical(y, num_classes=2)

Xtrans = imputer(X, 'ascending', 10)  # imputer() is a custom helper defined elsewhere in the notebook

train_X, val_X, train_y, val_y = train_test_split(Xtrans, y, random_state=1)




##################
# Model construct
##################


norm1 = preprocessing.Normalization()
norm1.adapt(np.array(train_X))
inputs = keras.Input(shape=(5,))
x = norm1(inputs)


for i in range(15):
    x = layers.Dense(units=64, activation="relu")(x)

initializer = keras.initializers.GlorotUniform()  # initializer was undefined in the original snippet
outputs = layers.Dense(2, activation="softmax", kernel_initializer=initializer)(x)
model = keras.Model(inputs, outputs)

opt = keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=opt, loss="sparse_categorical_crossentropy", metrics=["accuracy"])



##################
# Model training
##################

history = model.fit(
    train_X,
    train_y,
    #validation_split=0.1,
    batch_size=10,
    epochs=20
)

You can find the code here and modify it.

Thanks for your help!

If your labels are one-hot, then you must use categorical_crossentropy:

model.compile(optimizer=opt, loss="categorical_crossentropy", metrics=["accuracy"])
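A NumPy sketch (not the Keras implementation) showing that `categorical_crossentropy` on a one-hot label computes the same quantity as `sparse_categorical_crossentropy` on the equivalent integer label:

```python
import numpy as np

probs = np.array([0.21, 0.79])   # softmax output for one example

# categorical_crossentropy: one-hot label zeroes out all terms
# except the log-probability of the true class.
one_hot = np.array([0.0, 1.0])   # one-hot label for class 1
cat_loss = -np.sum(one_hot * np.log(probs))

# sparse_categorical_crossentropy: integer label indexes directly.
sparse_loss = -np.log(probs[1])

print(np.isclose(cat_loss, sparse_loss))  # True
```

Alternatively, you can keep your original integer labels of shape (number_of_examples,) and keep `loss="sparse_categorical_crossentropy"`; the two setups compute the same loss, which is why your first version worked.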

See here: https://keras.io/api/losses/probabilistic_losses/#sparsecategoricalcrossentropy-class