How to solve a CNN model fitting problem in TensorFlow 2.2.0?
I want to train a CNN model with image data. I have 2 classes (mask and no mask). I import and save the data with the following code:
import os
import cv2
import numpy as np
from keras.utils import np_utils  # with tf.keras you can use tensorflow.keras.utils.to_categorical instead

data_path = '/train/'
categories = os.listdir(data_path)
labels = [i for i in range(len(categories))]
label_dict = dict(zip(categories, labels))

data = []
target = []

for category in categories:
    folder_path = os.path.join(data_path, category)
    img_names = os.listdir(folder_path)

    for img_name in img_names:
        img_path = os.path.join(folder_path, img_name)
        img = cv2.imread(img_path)
        try:
            gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            resized = cv2.resize(gray, (500, 500))  # resize every image to 500x500
            data.append(resized)
            target.append(label_dict[category])
        except Exception as e:
            print('Exception:', e)

data = np.array(data) / 255.0
data = np.reshape(data, (data.shape[0], 500, 500, 1))
target = np.array(target)
new_target = np_utils.to_categorical(target)
# np.save('data', data)
# np.save('target', new_target)
I build the model like this:
import tensorflow as tf
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dropout, Dense

model = tf.keras.models.Sequential([
    Conv2D(32, 1, activation='relu', input_shape=(500, 500, 1)),
    MaxPooling2D(2, 2),
    Conv2D(64, 1, activation='relu'),
    MaxPooling2D(2, 2),
    Conv2D(128, 1, padding='same', activation='relu'),
    MaxPooling2D(2, 2),
    Flatten(),
    Dropout(0.5),
    Dense(256, activation='relu'),
    Dense(2, activation='softmax')  # the last dense layer has 2 units as we have only 2 classes
])

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary() gives me the following output:
________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 500, 500, 32) 64
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 250, 250, 32) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 250, 250, 64) 2112
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 125, 125, 64) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 125, 125, 128) 8320
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 62, 62, 128) 0
_________________________________________________________________
flatten (Flatten) (None, 492032) 0
_________________________________________________________________
dropout (Dropout) (None, 492032) 0
_________________________________________________________________
dense (Dense) (None, 256) 125960448
_________________________________________________________________
dense_1 (Dense) (None, 2) 514
=================================================================
Total params: 125,971,458
Trainable params: 125,971,458
Non-trainable params: 0
Then I fit the model, but the kernel stops. My fitting code is:
history=model.fit(data, target, epochs=10, batch_size=128, validation_data=data_val)
My tensorflow version is 2.2.0. Why won't my model run?
Your kernel appears to be dying (getting killed) because the process is using too many resources. It looks like you have built an unnecessarily complex model with far too many connections and trainable parameters. In fact, a single Dense layer accounts for 99.991% of all trainable parameters (125,960,448 out of 125,971,458).
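A quick sanity check in plain Python, using the shapes reported by your model.summary(), shows where almost all of those parameters come from:

# The last MaxPooling2D output is (62, 62, 128); Flatten turns it into one long vector.
flat_len = 62 * 62 * 128              # 492,032 inputs into the Dense layer

# Dense(256) needs one weight per (input, unit) pair plus 256 biases.
dense_params = flat_len * 256 + 256   # 125,960,448

total_params = 125_971_458            # total reported by model.summary()
print(dense_params / total_params)    # ~0.99991, i.e. 99.991% of all parameters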
The problem is that you are running out of compute resources (mostly RAM). For some context, here are a few of the most influential CNN-based architectures, most of which were trained for DAYS on powerful GPUs:
LeNet-5 - 60,000 parameters
AlexNet - 60M parameters
VGG-16 - 138M parameters
Inception-v1 - 5M parameters
Inception-v3 - 24M parameters
ResNet-50 - 26M parameters
Xception - 23M parameters
Inception-v4 - 43M parameters
Inception-ResNet-V2 - 56M parameters
ResNeXt-50 - 25M parameters
Your basic 3-conv-layer model - 126M parameters!
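If you want to see where your own model falls in this list before you even try to train it, tf.keras can report the total directly (a small sketch; model is the compiled model from the question):

# Total number of parameters, the same figure as the bottom of model.summary().
print(model.count_params())   # 125971458 for the model in the question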
Here is what you can do. First, look at the problem spot in your summary:
flatten (Flatten) (None, 492032) 0
_________________________________________________________________
dropout (Dropout) (None, 492032) 0
_________________________________________________________________
dense (Dense) (None, 256) 125960448 <---!!!!
_________________________________________________________________
You are flattening a 62x62x128 tensor into a vector of length 492,032! Instead, try adding more conv/pooling stages so the first two dimensions come down to a more manageable size, and/or increase the kernel size of the existing conv layers.
The goal is to have a reasonably small tensor before it reaches the Dense layer. Also, try cutting the number of units in the Dense layer significantly.
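As an aside (this is just an illustrative sketch, not the fix proposed below): another common way to shrink that tensor is to replace Flatten with GlobalAveragePooling2D, which averages each of the 128 feature maps down to a single number before the Dense layers:

from tensorflow.keras.layers import GlobalAveragePooling2D

# Same conv stack as the question, but GlobalAveragePooling2D collapses the
# (62, 62, 128) feature maps to a (128,) vector, so Dense(256) only needs
# 128 * 256 + 256 = 33,024 weights instead of ~126M.
model = tf.keras.models.Sequential([
    Conv2D(32, 1, activation='relu', input_shape=(500, 500, 1)),
    MaxPooling2D(2, 2),
    Conv2D(64, 1, activation='relu'),
    MaxPooling2D(2, 2),
    Conv2D(128, 1, padding='same', activation='relu'),
    MaxPooling2D(2, 2),
    GlobalAveragePooling2D(),
    Dropout(0.5),
    Dense(256, activation='relu'),
    Dense(2, activation='softmax')
])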
For a start, try something like this, something your machine can actually handle without killing the kernel, for example the model below with roughly 680K parameters (start simple and add complexity later):
model = tf.keras.models.Sequential([
    Conv2D(32, 3, activation='relu', input_shape=(500, 500, 1)),
    MaxPooling2D(3, 3),
    Conv2D(64, 3, activation='relu'),
    MaxPooling2D(3, 3),
    Conv2D(128, 3, padding='same', activation='relu'),
    MaxPooling2D(3, 3),
    Conv2D(256, 3, padding='same', activation='relu'),
    MaxPooling2D(3, 3),
    Flatten(),
    Dropout(0.5),
    Dense(32, activation='relu'),
    Dense(2, activation='softmax')  # the last dense layer has 2 units as we have only 2 classes
])
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_19 (Conv2D) (None, 498, 498, 32) 320
_________________________________________________________________
max_pooling2d_18 (MaxPooling (None, 166, 166, 32) 0
_________________________________________________________________
conv2d_20 (Conv2D) (None, 164, 164, 64) 18496
_________________________________________________________________
max_pooling2d_19 (MaxPooling (None, 54, 54, 64) 0
_________________________________________________________________
conv2d_21 (Conv2D) (None, 54, 54, 128) 73856
_________________________________________________________________
max_pooling2d_20 (MaxPooling (None, 18, 18, 128) 0
_________________________________________________________________
conv2d_22 (Conv2D) (None, 18, 18, 256) 295168
_________________________________________________________________
max_pooling2d_21 (MaxPooling (None, 6, 6, 256) 0
_________________________________________________________________
flatten_5 (Flatten) (None, 9216) 0
_________________________________________________________________
dropout_5 (Dropout) (None, 9216) 0
_________________________________________________________________
dense_10 (Dense) (None, 32) 294944
_________________________________________________________________
dense_11 (Dense) (None, 2) 66
=================================================================
Total params: 682,850
Trainable params: 682,850
Non-trainable params: 0
_________________________________________________________________
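One way to compile and fit this smaller model (an assumption on my part, not part of the original answer): since new_target from the question is one-hot encoded, categorical_crossentropy is the usual pairing, and a smaller batch_size keeps peak memory down:

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# new_target is the one-hot label array built in the question; a smaller
# batch_size and a validation_split (instead of a separate data_val set)
# keep memory usage modest. Adjust both to whatever your machine can handle.
history = model.fit(data, new_target, epochs=10, batch_size=32, validation_split=0.2)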