Can I use VGG16 for one channel images?
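I have just started learning Tensorflow (2.1.0), Keras (2.3.7), and Python 3.7.7.
I want to use the VGG16 network for semantic segmentation of black-and-white images (200x200x1).
I have used this network, whose original input_size is (224,224,3):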
# Imports were not shown in the original post; tensorflow.keras is assumed here.
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from tensorflow.keras.models import Model

def vgg16_encoder_decoder(input_size = (200,200,1)):
    #################################
    # Encoder
    #################################
    inputs = Input(input_size, name = 'input')
    conv1 = Conv2D(64, (3, 3), activation = 'relu', padding = 'same', name ='conv1_1')(inputs)
    conv1 = Conv2D(64, (3, 3), activation = 'relu', padding = 'same', name ='conv1_2')(conv1)
    pool1 = MaxPooling2D(pool_size = (2,2), strides = (2,2), name = 'pool_1')(conv1)
    conv2 = Conv2D(128, (3, 3), activation = 'relu', padding = 'same', name ='conv2_1')(pool1)
    conv2 = Conv2D(128, (3, 3), activation = 'relu', padding = 'same', name ='conv2_2')(conv2)
    pool2 = MaxPooling2D(pool_size = (2,2), strides = (2,2), name = 'pool_2')(conv2)
    conv3 = Conv2D(256, (3, 3), activation = 'relu', padding = 'same', name ='conv3_1')(pool2)
    conv3 = Conv2D(256, (3, 3), activation = 'relu', padding = 'same', name ='conv3_2')(conv3)
    conv3 = Conv2D(256, (3, 3), activation = 'relu', padding = 'same', name ='conv3_3')(conv3)
    pool3 = MaxPooling2D(pool_size = (2,2), strides = (2,2), name = 'pool_3')(conv3)
    conv4 = Conv2D(512, (3, 3), activation = 'relu', padding = 'same', name ='conv4_1')(pool3)
    conv4 = Conv2D(512, (3, 3), activation = 'relu', padding = 'same', name ='conv4_2')(conv4)
    conv4 = Conv2D(512, (3, 3), activation = 'relu', padding = 'same', name ='conv4_3')(conv4)
    pool4 = MaxPooling2D(pool_size = (2,2), strides = (2,2), name = 'pool_4')(conv4)
    conv5 = Conv2D(512, (3, 3), activation = 'relu', padding = 'same', name ='conv5_1')(pool4)
    conv5 = Conv2D(512, (3, 3), activation = 'relu', padding = 'same', name ='conv5_2')(conv5)
    conv5 = Conv2D(512, (3, 3), activation = 'relu', padding = 'same', name ='conv5_3')(conv5)
    pool5 = MaxPooling2D(pool_size = (2,2), strides = (2,2), name = 'pool_5')(conv5)
    #################################
    # Decoder
    #################################
    #conv1 = Conv2DTranspose(512, (2, 2), strides = 2, name = 'conv1')(pool5)
    upsp1 = UpSampling2D(size = (2,2), name = 'upsp1')(pool5)
    conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv6_1')(upsp1)
    conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv6_2')(conv6)
    conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv6_3')(conv6)
    upsp2 = UpSampling2D(size = (2,2), name = 'upsp2')(conv6)
    conv7 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv7_1')(upsp2)
    conv7 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv7_2')(conv7)
    conv7 = Conv2D(512, 3, activation = 'relu', padding = 'same', name = 'conv7_3')(conv7)
    upsp3 = UpSampling2D(size = (2,2), name = 'upsp3')(conv7)
    conv8 = Conv2D(256, 3, activation = 'relu', padding = 'same', name = 'conv8_1')(upsp3)
    conv8 = Conv2D(256, 3, activation = 'relu', padding = 'same', name = 'conv8_2')(conv8)
    conv8 = Conv2D(256, 3, activation = 'relu', padding = 'same', name = 'conv8_3')(conv8)
    upsp4 = UpSampling2D(size = (2,2), name = 'upsp4')(conv8)
    conv9 = Conv2D(128, 3, activation = 'relu', padding = 'same', name = 'conv9_1')(upsp4)
    conv9 = Conv2D(128, 3, activation = 'relu', padding = 'same', name = 'conv9_2')(conv9)
    upsp5 = UpSampling2D(size = (2,2), name = 'upsp5')(conv9)
    conv10 = Conv2D(64, 3, activation = 'relu', padding = 'same', name = 'conv10_1')(upsp5)
    conv10 = Conv2D(64, 3, activation = 'relu', padding = 'same', name = 'conv10_2')(conv10)
    conv11 = Conv2D(3, 3, activation = 'relu', padding = 'same', name = 'conv11')(conv10)
    model = Model(inputs = inputs, outputs = conv11, name = 'vgg-16_encoder_decoder')
    return model
Model summary:
Model: "vgg-16_encoder_decoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input (InputLayer)           (None, 200, 200, 1)       0
_________________________________________________________________
conv1_1 (Conv2D)             (None, 200, 200, 64)      640
_________________________________________________________________
conv1_2 (Conv2D)             (None, 200, 200, 64)      36928
_________________________________________________________________
pool_1 (MaxPooling2D)        (None, 100, 100, 64)      0
_________________________________________________________________
conv2_1 (Conv2D)             (None, 100, 100, 128)     73856
_________________________________________________________________
conv2_2 (Conv2D)             (None, 100, 100, 128)     147584
_________________________________________________________________
pool_2 (MaxPooling2D)        (None, 50, 50, 128)       0
_________________________________________________________________
conv3_1 (Conv2D)             (None, 50, 50, 256)       295168
_________________________________________________________________
conv3_2 (Conv2D)             (None, 50, 50, 256)       590080
_________________________________________________________________
conv3_3 (Conv2D)             (None, 50, 50, 256)       590080
_________________________________________________________________
pool_3 (MaxPooling2D)        (None, 25, 25, 256)       0
_________________________________________________________________
conv4_1 (Conv2D)             (None, 25, 25, 512)       1180160
_________________________________________________________________
conv4_2 (Conv2D)             (None, 25, 25, 512)       2359808
_________________________________________________________________
conv4_3 (Conv2D)             (None, 25, 25, 512)       2359808
_________________________________________________________________
pool_4 (MaxPooling2D)        (None, 12, 12, 512)       0
_________________________________________________________________
conv5_1 (Conv2D)             (None, 12, 12, 512)       2359808
_________________________________________________________________
conv5_2 (Conv2D)             (None, 12, 12, 512)       2359808
_________________________________________________________________
conv5_3 (Conv2D)             (None, 12, 12, 512)       2359808
_________________________________________________________________
pool_5 (MaxPooling2D)        (None, 6, 6, 512)         0
_________________________________________________________________
upsp1 (UpSampling2D)         (None, 12, 12, 512)       0
_________________________________________________________________
conv6_1 (Conv2D)             (None, 12, 12, 512)       2359808
_________________________________________________________________
conv6_2 (Conv2D)             (None, 12, 12, 512)       2359808
_________________________________________________________________
conv6_3 (Conv2D)             (None, 12, 12, 512)       2359808
_________________________________________________________________
upsp2 (UpSampling2D)         (None, 24, 24, 512)       0
_________________________________________________________________
conv7_1 (Conv2D)             (None, 24, 24, 512)       2359808
_________________________________________________________________
conv7_2 (Conv2D)             (None, 24, 24, 512)       2359808
_________________________________________________________________
conv7_3 (Conv2D)             (None, 24, 24, 512)       2359808
_________________________________________________________________
upsp3 (UpSampling2D)         (None, 48, 48, 512)       0
_________________________________________________________________
conv8_1 (Conv2D)             (None, 48, 48, 256)       1179904
_________________________________________________________________
conv8_2 (Conv2D)             (None, 48, 48, 256)       590080
_________________________________________________________________
conv8_3 (Conv2D)             (None, 48, 48, 256)       590080
_________________________________________________________________
upsp4 (UpSampling2D)         (None, 96, 96, 256)       0
_________________________________________________________________
conv9_1 (Conv2D)             (None, 96, 96, 128)       295040
_________________________________________________________________
conv9_2 (Conv2D)             (None, 96, 96, 128)       147584
_________________________________________________________________
upsp5 (UpSampling2D)         (None, 192, 192, 128)     0
_________________________________________________________________
conv10_1 (Conv2D)            (None, 192, 192, 64)      73792
_________________________________________________________________
conv10_2 (Conv2D)            (None, 192, 192, 64)      36928
_________________________________________________________________
conv11 (Conv2D)              (None, 192, 192, 3)       1731
=================================================================
Total params: 31,787,523
Trainable params: 31,787,523
Non-trainable params: 0
_________________________________________________________________
The last convolutional layer returns a shape of (192, 192, 3), but I need an image with shape (200, 200, 1).
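The 192 in the summary appears to come from integer division in the pooling layers. Below is a minimal sketch in plain Python (no Keras needed) that traces one spatial dimension through the network. The helper name `trace_spatial_size` is made up for illustration, and the sketch assumes the default 'valid' padding of the 2x2, stride-2 poolings, which rounds odd sizes down:

```python
def trace_spatial_size(size, num_stages=5):
    """Trace one spatial dimension through the encoder and decoder.

    Assumes each encoder stage ends in a 2x2 max pooling with stride 2
    (default 'valid' padding, so odd sizes are floored) and each decoder
    stage starts with a 2x upsampling. 'same' convolutions keep the size.
    """
    sizes = [size]
    for _ in range(num_stages):      # encoder: halve, rounding down
        size = size // 2
        sizes.append(size)
    for _ in range(num_stages):      # decoder: double
        size = size * 2
        sizes.append(size)
    return sizes


print(trace_spatial_size(200))
# [200, 100, 50, 25, 12, 6, 12, 24, 48, 96, 192]
print(trace_spatial_size(224)[-1])
# 224
```

The 25 to 12 step drops a pixel, which is why the output is 192 rather than 200. Sizes divisible by 32, such as the original 224, survive the round trip unchanged.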
I think I can change the last convolutional layer like this to get a one-channel image:
conv11 = Conv2D(1, 3, activation = 'relu', padding = 'same', name = 'conv11')(conv10)
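But I don't know whether this is correct, because I have been reading about the VGG16 network and it works with 3-channel images.
Can I use VGG16 for one channel images?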
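What you read about VGG working with three-channel (RGB) images applies only to the pretrained model, which was trained on the ImageNet dataset, and that dataset contains only color images. Since you are not using the pretrained model, this restriction does not apply to you.
So you can use one, three, or any number of input or output channels.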
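One way to see this is to count parameters: the number of image channels only changes the weights of the first and last convolutions. The sketch below is plain Python, and `conv2d_param_count` is a made-up helper for illustration. It uses the standard formula for a 2D convolution with a square kernel and a bias term, and reproduces the figures from the model summary in the question:

```python
def conv2d_param_count(in_channels, filters, kernel_size=3):
    """Weights plus biases of a 2D convolution with a square kernel."""
    return (kernel_size * kernel_size * in_channels + 1) * filters


# First layer (conv1_1, 64 filters): 1-channel input vs. RGB input.
print(conv2d_param_count(in_channels=1, filters=64))   # 640, as in the summary
print(conv2d_param_count(in_channels=3, filters=64))   # 1792

# Last layer (conv11, fed by 64 feature maps): 3 outputs vs. 1 output.
print(conv2d_param_count(in_channels=64, filters=3))   # 1731, as in the summary
print(conv2d_param_count(in_channels=64, filters=1))   # 577

# A layer in between does not depend on the image channels at all.
print(conv2d_param_count(in_channels=64, filters=64))  # 36928 (conv1_2)
```

Everything between the first and last layer keeps the same shape regardless of the channel count, so a network trained from scratch is free to use whatever input and output channels the data needs.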