One Hot Encoded Labels and Stratified K-Fold Cross Validation

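I am trying to implement stratified k-fold cross-validation on my ResNet-50 model. Unfortunately, when I one-hot encode the labels and try to split the data with stratified k-fold, I get this error: TypeError: A sparse matrix was passed, but dense data is required. Use X.toarray() to convert to a dense numpy array.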

The one-hot encoder is implemented like this:

from sklearn.preprocessing import OneHotEncoder
enc = OneHotEncoder()
enc.fit(Y)
Y = enc.transform(Y)
Y.toarray()

If I do not one-hot encode the labels, I get this error when fitting the model: ValueError: Shapes (None, 1) and (None, 4) are incompatible

This is the code that implements stratified k-fold:

for train, test in skf.split(X, Y):
  x = base_model.output
  x = GlobalAveragePooling2D()(x)
  x = Dropout(0.7)(x)
  predictions = Dense(num_classes, activation= 'softmax')(x)
  model = Model(inputs = base_model.input, outputs = predictions)
  adam = Adam(lr=0.0001)
  model.compile(optimizer= adam, loss='categorical_crossentropy', metrics=['accuracy'])

  # Training
  history = model.fit(X[train], Y[train], epochs = 100, batch_size = 16)

where num_classes = 4.

So my question is:

Q: How do I get one-hot encoded labels to work with skf.split()?

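If the problem is that a sparse matrix is not accepted, note that returning one is the default, expected behaviour of OneHotEncoder. You can try changing the encoder by setting its parameter sparse=False, so that transform() returns a dense array instead.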

from sklearn.preprocessing import OneHotEncoder
enc = OneHotEncoder(sparse=False)  # in scikit-learn >= 1.2 this argument is named sparse_output
enc.fit(Y)
Y = enc.transform(Y)
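A dense one-hot array may still not be enough on its own: as far as I know, StratifiedKFold stratifies on 1-D class labels and rejects a 2-D indicator matrix. One possible approach, sketched below, is to pass the integer labels to skf.split() and use the returned indices to slice the one-hot array for training. The data here (X, y_int, Y_onehot) is a hypothetical stand-in, not the asker's dataset, and the Keras model is left out so the snippet runs with only NumPy and scikit-learn.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical stand-in data: 40 samples, 4 classes, 10 samples per class.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 8))
y_int = np.repeat(np.arange(4), 10)   # integer class labels, shape (40,)
Y_onehot = np.eye(4)[y_int]           # dense one-hot labels, shape (40, 4)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

fold_shapes = []
# Stratify on the 1-D integer labels; the one-hot array is only indexed.
for train_idx, test_idx in skf.split(X, y_int):
    X_train, Y_train = X[train_idx], Y_onehot[train_idx]
    X_test, Y_test = X[test_idx], Y_onehot[test_idx]
    fold_shapes.append((Y_train.shape, Y_test.shape))
    # A call such as model.fit(X_train, Y_train, ...) would go here.
```

If only the one-hot array is available, the integer labels can be recovered with Y_onehot.argmax(axis=1) and passed to skf.split() in the same way.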