One Hot Encoded Labels and Stratified K-Fold Cross Validation
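I am trying to implement stratified k-fold cross-validation on my ResNet-50 model.
Unfortunately, when I one-hot encode the labels and try to split the data with stratified k-fold, I get this error: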
TypeError: A sparse matrix was passed, but dense data is required. Use X.toarray() to convert to a dense numpy array.
The one-hot encoder is implemented like this:
from sklearn.preprocessing import OneHotEncoder
enc = OneHotEncoder()
enc.fit(Y)
Y = enc.transform(Y)
Y.toarray()  # note: the result is not assigned, so Y is still a sparse matrix
If I don't one-hot encode the labels, I get this error when fitting the model:
ValueError: Shapes (None, 1) and (None, 4) are incompatible
This is the code that implements the stratified k-fold:
for train, test in skf.split(X, Y):
    x = base_model.output
    x = GlobalAveragePooling2D()(x)
    x = Dropout(0.7)(x)
    predictions = Dense(num_classes, activation='softmax')(x)
    model = Model(inputs=base_model.input, outputs=predictions)
    adam = Adam(lr=0.0001)
    model.compile(optimizer=adam, loss='categorical_crossentropy', metrics=['accuracy'])
    # Training
    history = model.fit(X[train], Y[train], epochs=100, batch_size=16)
where num_classes = 4.
So my question is:
Q: How do I get one-hot encoded labels to work with skf.split()?
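If the problem is that the model does not accept the sparse matrix that OneHotEncoder returns (which is its generally expected default behaviour), you can try making the encoder return a dense array by setting its parameter sparse=False. Note that in scikit-learn 1.2 and later this parameter is named sparse_output.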
from sklearn.preprocessing import OneHotEncoder
enc = OneHotEncoder(sparse=False)
enc.fit(Y)
Y = enc.transform(Y)
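
A dense array may still not be enough on its own: as far as I know, StratifiedKFold.split() expects y to be a 1-D array of class labels and is likely to reject a 2-D one-hot array as a multilabel target. A minimal sketch of one way around this is to stratify on the integer labels and use the returned indices to slice a separate one-hot array for training. The data below (X, y_int, Y_onehot) is made-up stand-in data, not the asker's images, and the model.fit call is left as a comment.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

num_classes = 4

# Hypothetical stand-in data: 40 samples, 10 per class.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 8)).astype("float32")
y_int = np.repeat(np.arange(num_classes), 10)  # integer labels, shape (40,)

# Dense one-hot labels built by indexing an identity matrix.
Y_onehot = np.eye(num_classes, dtype="float32")[y_int]  # shape (40, 4)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

fold_shapes = []
# Stratify on the integer labels; slice the one-hot array with the indices.
for train, test in skf.split(X, y_int):
    X_train, Y_train = X[train], Y_onehot[train]
    X_test, Y_test = X[test], Y_onehot[test]
    fold_shapes.append((Y_train.shape, Y_test.shape))
    # model.fit(X_train, Y_train, ...) would go here, with
    # loss='categorical_crossentropy' as in the question.
```

If only the one-hot labels are available, the integer labels can be recovered with Y_onehot.argmax(axis=1) before calling skf.split().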