How to use cross-validation with Keras image datasets from directories?
I have an image dataset in Keras that I load directly with the corresponding function, already split between training and test sets:
from tensorflow import keras

tds = keras.preprocessing.image_dataset_from_directory(
    'dataset_folder', seed=123,
    validation_split=0.35, subset='training')
vds = keras.preprocessing.image_dataset_from_directory(
    'dataset_folder', seed=123,
    validation_split=0.35, subset='validation')
Then I go through the usual stages for my neural network:
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

num_classes = 5
model = Sequential([
    layers.experimental.preprocessing.Rescaling(
        1.0/255, input_shape=(256, 256, 3)),
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(num_classes)])
model.compile(
    optimizer='adam', metrics=['accuracy'],
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True))
hist = model.fit(tds, validation_data=vds, epochs=15)
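How can I implement cross-validation using KFold or StratifiedKFold from sklearn.model_selection? If I have to change the way the data is loaded to make this possible, I would also be glad to know how to do that.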
Have a look at these suggestions on implementing cross-validation in Keras:
https://machinelearningmastery.com/evaluate-performance-deep-learning-models-keras/
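Loading the data with image_dataset_from_directory produces a tf.data.Dataset object, and I am not sure whether that lends itself to the implementation above. One alternative is to convert the images to NumPy arrays, which can then be split with K-fold. For that, you can refer to the following: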
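As a rough sketch of the NumPy route (my own illustration, not code from the linked article), the fold loop could look something like this. It assumes the images and integer labels are already in two arrays. The names cross_validate, train_and_score and majority_baseline are hypothetical, and the random arrays only stand in for real images so that the snippet runs without TensorFlow or an image folder.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold


def cross_validate(images, labels, train_and_score, n_splits=5, seed=123):
    """Run stratified k-fold CV and return one score per fold.

    train_and_score is a hypothetical callback. It receives
    (x_train, y_train, x_val, y_val), should build a fresh model,
    fit it on the training part and return a validation score.
    """
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    # Only the labels matter for the split, so a zero placeholder
    # is passed instead of the (possibly large) image array.
    placeholder = np.zeros(len(labels))
    for train_idx, val_idx in skf.split(placeholder, labels):
        score = train_and_score(images[train_idx], labels[train_idx],
                                images[val_idx], labels[val_idx])
        scores.append(score)
    return np.array(scores)


# Stand-in data (an assumption for this sketch):
# 50 tiny random "images" spread evenly over 5 classes.
rng = np.random.default_rng(0)
images = rng.random((50, 8, 8, 3), dtype=np.float32)
labels = np.repeat(np.arange(5), 10)


def majority_baseline(x_train, y_train, x_val, y_val):
    # Placeholder for the Keras model: always predict the most common
    # training label. With Keras, this function would instead rebuild and
    # compile the Sequential model, call model.fit(x_train, y_train, ...)
    # and return the accuracy reported by model.evaluate(x_val, y_val).
    majority = np.bincount(y_train).argmax()
    return float(np.mean(y_val == majority))


scores = cross_validate(images, labels, majority_baseline)
print(scores.mean(), scores.std())
```

The important detail is that the model is rebuilt inside the callback for every fold. Reusing one model object across folds would carry trained weights from one fold into the next and inflate the scores.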
Note: the Machine Learning Mastery link given above includes the following statement:
Cross validation is often not used for evaluating deep learning models because of the greater computational expense. For example k-fold cross validation is often used with 5 or 10 folds. As such, 5 or 10 models must be constructed and evaluated, greatly adding to the evaluation time of a model.
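Since the question also asks how the loading could change, one option (an alternative I am suggesting here, not something from the linked article) is to split file paths rather than pixel arrays, so that each fold only reads the images it needs. The sketch below assumes the folder has one subfolder per class, which is the layout image_dataset_from_directory expects. The helper name list_images and the suffix list are hypothetical choices for illustration.

```python
from pathlib import Path

import numpy as np

# Assumed set of image extensions; adjust to the files actually present.
IMAGE_SUFFIXES = {'.jpg', '.jpeg', '.png', '.bmp', '.gif'}


def list_images(root):
    """Return (paths, labels, class_names) for a one-subfolder-per-class tree."""
    root = Path(root)
    # Class indices follow the sorted subfolder names.
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    paths, labels = [], []
    for index, name in enumerate(class_names):
        for file in sorted((root / name).rglob('*')):
            if file.is_file() and file.suffix.lower() in IMAGE_SUFFIXES:
                paths.append(str(file))
                labels.append(index)
    return np.array(paths), np.array(labels), class_names
```

The returned paths and labels can be passed to KFold or StratifiedKFold like any other arrays. For each fold, paths[train_idx] and labels[train_idx] could then be turned into a tf.data.Dataset, for example with tf.data.Dataset.from_tensor_slices followed by a map that reads and decodes each file, but I have not tested that part against this dataset.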