Why is my model overfitting on the second epoch?
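I'm a beginner in deep learning, and I'm trying to train a model to classify different American Sign Language (ASL) hand gestures using Mobilenet_v2 and Inception.
Here is my code, which uses an ImageDataGenerator to create the training and validation sets.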
# Reformat images and create batches
import tensorflow as tf

IMAGE_RES = 224
BATCH_SIZE = 32

# Rescale pixel values to [0, 1] and hold out 40% of the images for validation
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    validation_split=0.4
)

# base_dir is the dataset root directory, one subfolder per class (defined elsewhere)
train_generator = datagen.flow_from_directory(
    base_dir,
    target_size=(IMAGE_RES, IMAGE_RES),
    batch_size=BATCH_SIZE,
    subset='training'
)
val_generator = datagen.flow_from_directory(
    base_dir,
    target_size=(IMAGE_RES, IMAGE_RES),
    batch_size=BATCH_SIZE,
    subset='validation'
)
Here is the code that trains the model:
# Do transfer learning with TensorFlow Hub
import tensorflow_hub as hub
from tensorflow.keras import layers

URL = "https://tfhub.dev/google/tf2-preview/mobilenet_v2/feature_vector/4"
feature_extractor = hub.KerasLayer(URL,
                                   input_shape=(IMAGE_RES, IMAGE_RES, 3))

# Freeze the pre-trained model so only the new head is trained
feature_extractor.trainable = False

# Attach a classification head (5 output classes)
model = tf.keras.Sequential([
    feature_extractor,
    layers.Dense(5, activation='softmax')
])
model.summary()

# Train the model
model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy'])

EPOCHS = 5
history = model.fit(train_generator,
                    steps_per_epoch=len(train_generator),
                    epochs=EPOCHS,
                    validation_data=val_generator,
                    validation_steps=len(val_generator))
Epoch 1/5
94/94 [==============================] - 19s 199ms/step - loss: 0.7333 - accuracy: 0.7730 - val_loss: 0.6276 - val_accuracy: 0.7705
Epoch 2/5
94/94 [==============================] - 18s 190ms/step - loss: 0.1574 - accuracy: 0.9893 - val_loss: 0.5118 - val_accuracy: 0.8145
Epoch 3/5
94/94 [==============================] - 18s 191ms/step - loss: 0.0783 - accuracy: 0.9980 - val_loss: 0.4850 - val_accuracy: 0.8235
Epoch 4/5
94/94 [==============================] - 18s 196ms/step - loss: 0.0492 - accuracy: 0.9997 - val_loss: 0.4541 - val_accuracy: 0.8395
Epoch 5/5
94/94 [==============================] - 18s 193ms/step - loss: 0.0349 - accuracy: 0.9997 - val_loss: 0.4590 - val_accuracy: 0.8365
I have tried using data augmentation, but the model still overfits, so I'm wondering whether I've done something wrong in my code.
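To put a number on the overfitting described here, the per-epoch gap between training and validation accuracy can be computed from the log. This is only a small illustrative sketch: the values are copied from the training output, and the helper name accuracy_gaps is made up for this example.

```python
# Metrics copied from the training log: (train_acc, val_acc, train_loss, val_loss)
log = [
    (0.7730, 0.7705, 0.7333, 0.6276),
    (0.9893, 0.8145, 0.1574, 0.5118),
    (0.9980, 0.8235, 0.0783, 0.4850),
    (0.9997, 0.8395, 0.0492, 0.4541),
    (0.9997, 0.8365, 0.0349, 0.4590),
]


def accuracy_gaps(rows):
    """Return the train-minus-validation accuracy gap for each epoch."""
    return [round(train_acc - val_acc, 4) for train_acc, val_acc, _, _ in rows]


gaps = accuracy_gaps(log)
for epoch, gap in enumerate(gaps, start=1):
    print(f"epoch {epoch}: accuracy gap = {gap:.4f}")
```

The gap goes from about 0.0025 in epoch 1 to about 0.17 in epoch 2 and stays near that level, while validation loss keeps falling until epoch 4 (0.6276 down to 0.4541).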
Your dataset is small. Try splitting it with a random seed and check whether the problem persists.
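A seeded split can be sketched with the standard library alone. As far as I know, validation_split in ImageDataGenerator takes a fixed slice of each class's files rather than a random sample, so a manual split like the one below is one way to test a different partition. The function name stratified_split and the file names are hypothetical, used only for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labeled_paths, val_fraction=0.4, seed=42):
    """Shuffle each class with a fixed seed, then split it into train/val parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for path, label in labeled_paths:
        by_class[label].append(path)

    train, val = [], []
    for label in sorted(by_class):
        paths = sorted(by_class[label])  # sort first so the result depends only on the seed
        rng.shuffle(paths)
        n_val = int(len(paths) * val_fraction)
        val += [(p, label) for p in paths[:n_val]]
        train += [(p, label) for p in paths[n_val:]]
    return train, val


# Hypothetical file names: 5 classes with 10 images each
samples = [(f"{label}/img_{i:03d}.jpg", label) for label in "ABCDE" for i in range(10)]
train, val = stratified_split(samples, val_fraction=0.4, seed=0)
print(len(train), len(val))  # 30 20
```

Changing the seed gives a different but reproducible partition. The resulting lists could then be copied into separate train and validation folders, or loaded with flow_from_dataframe, depending on how the rest of the pipeline is set up.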
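If the problem persists, add regularization and reduce the complexity of the neural network.
Also try different optimizers and a smaller learning rate (try a learning rate scheduler).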
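The learning rate suggestion can be sketched as a plain schedule function. This is a minimal sketch, not a tuned recipe: the starting rate of 1e-4, the halving factor, and the name step_decay are all assumptions chosen for illustration. The (epoch, lr) signature matches what tf.keras.callbacks.LearningRateScheduler expects from its schedule argument.

```python
def step_decay(epoch, lr=None, initial_lr=1e-4, drop=0.5, epochs_per_drop=2):
    """Multiply the learning rate by `drop` every `epochs_per_drop` epochs.

    `epoch` is 0-based. The incoming `lr` is ignored because the rate is
    recomputed from `initial_lr` each time.
    """
    return initial_lr * drop ** (epoch // epochs_per_drop)


# Learning rate for each of the 5 epochs used in the question
for epoch in range(5):
    print(epoch, step_decay(epoch))
```

The function could be passed to model.fit through callbacks=[tf.keras.callbacks.LearningRateScheduler(step_decay)]. For the regularization part, one common option is a layers.Dropout layer between feature_extractor and the Dense head; whether it helps here would need to be checked against the validation metrics.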
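Your dataset seems to be very small, with some true outputs separated by only a small distance of input on the input-output curve. That is why the model fits those points so easily.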
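For a rough sense of how small the dataset is, its size can be estimated from the question's own numbers: 94 steps per epoch at BATCH_SIZE = 32, with validation_split = 0.4. This is a minimal sketch, assuming the last batch may be partial and that the training subset is about 60% of all images. The helper name sample_range is hypothetical.

```python
def sample_range(steps, batch_size):
    """Min and max sample counts that produce `steps` batches (last batch may be partial)."""
    return (steps - 1) * batch_size + 1, steps * batch_size


STEPS_PER_EPOCH = 94      # from the "94/94" progress lines in the log
BATCH_SIZE = 32
VALIDATION_SPLIT = 0.4

train_low, train_high = sample_range(STEPS_PER_EPOCH, BATCH_SIZE)

# The training subset is roughly (1 - VALIDATION_SPLIT) of all images
total_low = round(train_low / (1 - VALIDATION_SPLIT))
total_high = round(train_high / (1 - VALIDATION_SPLIT))

print(f"training images: {train_low} to {train_high}")
print(f"estimated total: {total_low} to {total_high}")
```

Under these assumptions the model sees roughly 3,000 training images, out of about 5,000 in total spread across the five classes.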