How to increase the validation accuracy in a CNN model
I want to build a CNN model to distinguish normal faces from Down syndrome faces, and then classify gender with a second model. I have tried changing the number of layers, the number of nodes, the epochs, and the optimizer. I have also tried both color and grayscale images. The dataset is 799 images of normal people and people with Down syndrome.
Here is my code:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, BatchNormalization, MaxPooling2D,
                                     Dropout, Flatten, Dense)

model = Sequential()
# Four convolutional blocks: Conv -> BatchNorm -> MaxPool -> Dropout
model.add(Conv2D(filters=16, kernel_size=(5,5), activation="relu",
                 input_shape=X_train[0].shape))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2)))
model.add(Dropout(0.2))
model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2)))
model.add(Dropout(0.3))
model.add(Conv2D(filters=64, kernel_size=(5,5), activation="relu"))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2)))
model.add(Dropout(0.3))
model.add(Conv2D(filters=64, kernel_size=(5,5), activation="relu"))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2)))
model.add(Dropout(0.2))
model.add(Flatten())
# Two dense layers and then the output layer
model.add(Dense(256, activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))  # dropout to reduce overfitting
model.add(Dense(128, activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(2, activation='softmax'))
I tried changing the final activation layer from softmax to sigmoid and vice versa, but with no success. The input images are 200x200.
Model: "sequential_4"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_16 (Conv2D) (None, 196, 196, 16) 416
batch_normalization_24 (Bat (None, 196, 196, 16) 64
chNormalization)
max_pooling2d_16 (MaxPoolin (None, 98, 98, 16) 0
g2D)
dropout_24 (Dropout) (None, 98, 98, 16) 0
conv2d_17 (Conv2D) (None, 94, 94, 32) 12832
batch_normalization_25 (Bat (None, 94, 94, 32) 128
chNormalization)
max_pooling2d_17 (MaxPoolin (None, 47, 47, 32) 0
g2D)
dropout_25 (Dropout) (None, 47, 47, 32) 0
conv2d_18 (Conv2D) (None, 43, 43, 64) 51264
batch_normalization_26 (Bat (None, 43, 43, 64) 256
chNormalization)
max_pooling2d_18 (MaxPoolin (None, 21, 21, 64) 0
g2D)
dropout_26 (Dropout) (None, 21, 21, 64) 0
conv2d_19 (Conv2D) (None, 17, 17, 64) 102464
batch_normalization_27 (Bat (None, 17, 17, 64) 256
chNormalization)
max_pooling2d_19 (MaxPoolin (None, 8, 8, 64) 0
g2D)
dropout_27 (Dropout) (None, 8, 8, 64) 0
flatten_4 (Flatten) (None, 4096) 0
dense_12 (Dense) (None, 256) 1048832
batch_normalization_28 (Bat (None, 256) 1024
chNormalization)
dropout_28 (Dropout) (None, 256) 0
dense_13 (Dense) (None, 128) 32896
batch_normalization_29 (Bat (None, 128) 512
chNormalization)
dropout_29 (Dropout) (None, 128) 0
dense_14 (Dense) (None, 2) 258
=================================================================
Total params: 1,251,202
Trainable params: 1,250,082
Non-trainable params: 1,120
_________________________________________________________________
model.compile(optimizer='Adam', loss='binary_crossentropy', metrics=['accuracy'])

# split train and validation data
from sklearn.model_selection import train_test_split
X_train, X_valid, y_train, y_valid = train_test_split(X_train, y_train, test_size=0.15)
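One detail worth checking here (an editorial note, not from the original post): in Keras, binary_crossentropy paired with a 2-unit softmax output treats each unit as an independent binary problem and can report a misleading accuracy metric. A minimal sketch of two self-consistent pairings, assuming one-hot labels for the softmax case (use sparse_categorical_crossentropy if y is integer-encoded):

# Pairing 1: keep the 2-unit softmax head, switch to a categorical loss.
model.compile(optimizer='Adam', loss='categorical_crossentropy',
              metrics=['accuracy'])  # expects one-hot y of shape (N, 2)

# Pairing 2: a 1-unit sigmoid head with binary_crossentropy.
# model.add(Dense(1, activation='sigmoid'))  # would replace Dense(2, 'softmax')
# model.compile(optimizer='Adam', loss='binary_crossentropy',
#               metrics=['accuracy'])        # expects 0/1 labels of shape (N,)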
I want to raise the accuracy to at least 70%, but the highest score I have reached is 47%.
history = model.fit(X_train, y_train, epochs=50, validation_data=(X_valid, y_valid), batch_size=64)
Epoch 1/50
5/5 [==============================] - 23s 4s/step - loss: 0.9838 - accuracy: 0.5390 - val_loss: 0.6931 - val_accuracy: 0.4800
Epoch 2/50
5/5 [==============================] - 21s 4s/step - loss: 0.8043 - accuracy: 0.6348 - val_loss: 0.7109 - val_accuracy: 0.4800
Epoch 3/50
5/5 [==============================] - 21s 4s/step - loss: 0.6745 - accuracy: 0.6915 - val_loss: 0.7554 - val_accuracy: 0.4800
Epoch 4/50
5/5 [==============================] - 21s 4s/step - loss: 0.6429 - accuracy: 0.7589 - val_loss: 0.8261 - val_accuracy: 0.4800
Epoch 5/50
5/5 [==============================] - 21s 4s/step - loss: 0.5571 - accuracy: 0.8014 - val_loss: 0.9878 - val_accuracy: 0.4800
Is there any way to increase it further? And how can I combine the two models?
Any help would be appreciated. Thank you very much.
Try image augmentation. I mean, it is clear the model is overfitting the data. You could also change the train_test_split ratio (increase it).
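A minimal augmentation sketch using Keras' ImageDataGenerator (the transform parameters are illustrative, not from the original post; apply augmentation to the training set only):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Random rotations, shifts, and flips create extra training variety.
# Fit on augmented training batches; leave the validation set untouched.
train_gen = ImageDataGenerator(rotation_range=15,
                               width_shift_range=0.1,
                               height_shift_range=0.1,
                               horizontal_flip=True)

history = model.fit(train_gen.flow(X_train, y_train, batch_size=64),
                    epochs=50, validation_data=(X_valid, y_valid))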
I think one of two things is going on. The training data would point to overfitting, but given the amount of dropout in the model I would not suspect that is the case. I think it may be that the probability distribution of your training data is significantly different from that of your validation data. That can happen when you have very few training samples. So how many training samples do you have in each of your 2 classes? If there are fewer than 120 samples per class, use image augmentation to create more training samples.

How did you generate the validation images? If you have separate validation images, it is better to combine the training set with the validation set and then use sklearn's train_test_split to randomly split the combined data into a train set and a validation set (a sketch of this re-split follows the callback code below). Note: use augmentation only on the training set, never on the validation set.

I also recommend you use the Keras ReduceLROnPlateau callback to get an adjustable learning rate; the documentation is here. The code below shows the settings I use for it:
import tensorflow as tf

rlronp = tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5,
                                              patience=1, verbose=1)
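Returning to the earlier point about re-splitting: a minimal sketch that pools the existing train and validation images and draws a fresh random split, assuming X_train, X_valid, y_train, y_valid are NumPy arrays (the stratify argument is my addition, to keep the class balance equal in both splits):

import numpy as np
from sklearn.model_selection import train_test_split

# Pool both sets so train and validation come from the same distribution,
# then split randomly.
X_all = np.concatenate([X_train, X_valid])
y_all = np.concatenate([y_train, y_valid])
X_train, X_valid, y_train, y_valid = train_test_split(
    X_all, y_all, test_size=0.15, stratify=y_all)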
I also recommend using the Keras EarlyStopping callback; the documentation is here. The code below shows my implementation:
estop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=3,
                                         verbose=1, restore_best_weights=True)
Include the callbacks in model.fit:
history = model.fit(....., callbacks=[estop, rlronp])
Set the number of epochs to run to a fairly large value.
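Putting the pieces together, a sketch of the resulting fit call (epochs=100 is just an illustrative "fairly large" value; the callbacks are the ones defined above):

history = model.fit(X_train, y_train, epochs=100,
                    validation_data=(X_valid, y_valid), batch_size=64,
                    callbacks=[estop, rlronp])
# EarlyStopping ends training once val_loss has not improved for 3 epochs
# and restores the best weights; ReduceLROnPlateau halves the learning
# rate whenever val_loss plateaus for 1 epoch.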