VGG16 with other input shape and Imagenet weights

Question

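I'm new to models like VGG16. I have been searching for information about the model, but I still have doubts. I have 10,000 images of different sizes to train a model (2 classes). Because of computational limits I decided to use an image size of 86x86, which is also close to the average size of the images. So I did this: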

base_model16 = VGG16(weights='imagenet', include_top=False, input_shape=(86,86,3))

For the generator:

datagen = ImageDataGenerator(preprocessing_function=preprocess_vgg16) 

train_generator = datagen.flow_from_directory(path_train,
                                                    target_size=(86,86),
                                                    color_mode='rgb',
                                                    batch_size = 128,
                                                    class_mode='categorical',
                                                    shuffle=True)

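I read that VGG16 was trained with 224x224 images. I know we can use other sizes, but can someone confirm that what I am doing is correct? I ask because I am using the imagenet weights and preprocess_vgg16, and both come from training at 224x224. Sorry if someone has already asked this before, but I need help understanding it.

Thanks.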

Answer 1

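You have to modify the VGG model, because it was designed to classify images into 1000 classes. Setting include_top=False removes the top of the model, the fully connected classifier that ends in a 1000-neuron layer. What remains is convolutional, which is why an input shape other than 224x224, such as 86x86, still works with the imagenet weights. Now we need to add a layer with 2 neurons. The code below does that. Note that in the parameters of the VGG model I set pooling='max'. This makes the output of the VGG model a vector, which can be used as the input to a Dense layer.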

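import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

base_model = tf.keras.applications.VGG16(include_top=False, input_shape=(86, 86, 3),
                                         pooling='max', weights='imagenet')
x = base_model.output  # one 512-element vector per image, thanks to pooling='max'
output = Dense(2, activation='softmax')(x)  # 2 neurons, one per class
model = Model(inputs=base_model.input, outputs=output)
model.compile(Adam(learning_rate=.001), loss='categorical_crossentropy', metrics=['accuracy'])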
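To see why an 86x86 input is compatible with this model, it can help to trace the sizes by hand. The following is a minimal sketch in plain Python, not a call into Keras: the helper names are made up for illustration, and it assumes the standard VGG16 layout (thirteen 3x3 convolutions with 'same' padding, five 2x2 max-pools with stride 2).

```python
# Hypothetical helpers for illustration only; they mirror the standard
# VGG16 layout rather than calling into Keras.

# (in_channels, out_channels) of the 13 convolution layers in VGG16.
CONV_CHANNELS = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]


def feature_map_sides(side, num_pools=5):
    """Spatial side length after each of the five pooling stages.

    'same'-padded convolutions keep the size; each 2x2 max-pool with
    stride 2 halves it, rounding down.
    """
    sides = [side]
    for _ in range(num_pools):
        side //= 2
        sides.append(side)
    return sides


def conv_base_params():
    """Weights plus biases of all 3x3 convolutions."""
    return sum((3 * 3 * c_in + 1) * c_out for c_in, c_out in CONV_CHANNELS)


sides_86 = feature_map_sides(86)     # input used in the question
sides_224 = feature_map_sides(224)   # size VGG16 was originally trained at
base_params = conv_base_params()
head_params = 512 * 2 + 2            # Dense(2) on the 512-element pooled vector

print("86x86 input:  ", sides_86)
print("224x224 input:", sides_224)
print("conv base parameters:", base_params)
print("Dense(2) parameters: ", head_params)
```

The convolution weights do not depend on the image size, only the final feature map does (2x2x512 for 86x86 versus 7x7x512 for 224x224), and pooling='max' reduces either one to a 512-element vector before the Dense layer.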

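By the way, I don't like to use VGG16. The full model has about 138 million parameters (about 14.7 million in the convolutional base used here), and it is computationally expensive, which leads to long training times. I prefer the MobileNet model, which has only about 4 million parameters and reaches comparable accuracy. To use MobileNet, just use the line of code below instead of the code for the VGG model. Note that I set input_shape to (128,128,3), because there is a version of the MobileNet weights trained on imagenet with 128x128 images; it is downloaded automatically and should help the model converge faster. You can still use 86x86 if you prefer. If you go with 128x128, set target_size=(128,128) in your train_generator. Also, in the ImageDataGenerator, replace preprocessing_function=preprocess_vgg16 with keras.applications.mobilenet.preprocess_input. The two are not interchangeable: the VGG16 function converts RGB to BGR and subtracts the ImageNet channel means without rescaling, while the MobileNet function rescales pixels to between -1 and +1.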

# dropout only applies to the classification top, so it has no effect with include_top=False
base_model = tf.keras.applications.mobilenet.MobileNet(include_top=False,
             input_shape=(128, 128, 3), pooling='max', weights='imagenet', dropout=.4)
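On the preprocessing question, the sketch below uses NumPy only to show how the two preprocessing schemes treat the same pixels. These are re-implementations written for illustration, not the Keras functions themselves; the function names are hypothetical, and the mean values are the ImageNet channel means in BGR order that the VGG-style preprocessing is generally documented to use. In real code, pass the matching preprocess_input from keras.applications to the generator.

```python
import numpy as np

# ImageNet per-channel means in BGR order (assumed values for VGG-style preprocessing).
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def vgg16_style_preprocess(rgb):
    """Illustrative re-implementation: RGB -> BGR, subtract channel means, no rescaling."""
    bgr = rgb[..., ::-1].astype(np.float32)
    return bgr - IMAGENET_BGR_MEAN


def mobilenet_style_preprocess(rgb):
    """Illustrative re-implementation: rescale [0, 255] to [-1, 1]."""
    return rgb.astype(np.float32) / 127.5 - 1.0


# A tiny all-white and all-black RGB batch: shape (batch, height, width, channels).
white = np.full((1, 2, 2, 3), 255, dtype=np.uint8)
black = np.zeros((1, 2, 2, 3), dtype=np.uint8)

vgg_white = vgg16_style_preprocess(white)
mob_white = mobilenet_style_preprocess(white)
mob_black = mobilenet_style_preprocess(black)

print("VGG-style, white pixel:      ", vgg_white[0, 0, 0])
print("MobileNet-style, white pixel:", mob_white[0, 0, 0])
print("MobileNet-style, black pixel:", mob_black[0, 0, 0])
```

The value ranges differ by two orders of magnitude, so feeding a network inputs preprocessed for the other architecture would mismatch what its imagenet weights were trained on.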

