Keras VGG16 fine-tuning

Question

There is an example of VGG16 fine-tuning on the Keras blog, but I can't reproduce it.

More precisely, here is the code used to initialize VGG16 without the top layers and to freeze all blocks except the topmost one:

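from keras.models import Sequential
from keras.layers import InputLayer, Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from keras.optimizers import SGD
from keras.preprocessing.image import ImageDataGenerator
from keras.utils.data_utils import get_file

WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'
weights_path = get_file('vgg16_weights.h5', WEIGHTS_PATH_NO_TOP)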

model = Sequential()
model.add(InputLayer(input_shape=(150, 150, 3)))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))

model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))

model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))

model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))

model.add(Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1'))
model.add(Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2'))
model.add(Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2), name='block5_maxpool'))

model.load_weights(weights_path)

for layer in model.layers:
    layer.trainable = False

for layer in model.layers[-4:]:
    layer.trainable = True
    print("Layer '%s' is trainable" % layer.name)
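For reference, the output shape of this convolutional base for a 150x150 input can be worked out by hand. The sketch below is only illustrative arithmetic, not Keras code: it assumes channels-last ordering (as in the `tf_dim_ordering` weights file), 'same'-padded convolutions that keep the spatial size, and 2x2 stride-2 pooling that floors each dimension by half. The function name is hypothetical.

```python
def vgg16_notop_output_shape(height, width):
    """Output shape (channels last) after the five VGG16 convolutional blocks.

    'same'-padded 3x3 convolutions leave height and width unchanged; each
    2x2 max-pool with stride 2 halves them, rounding down.
    """
    for _ in range(5):  # one pooling layer per block
        height, width = height // 2, width // 2
    return (height, width, 512)  # block 5 uses 512 filters


shape = vgg16_notop_output_shape(150, 150)   # 150 -> 75 -> 37 -> 18 -> 9 -> 4
flattened = shape[0] * shape[1] * shape[2]   # size seen by a Flatten layer
print(shape, flattened)
```

Under these assumptions a classifier placed on top of the base receives a 4x4x512 feature map, i.e. 8192 values per image after flattening.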

Next, create the top model with a single hidden layer:

top_model = Sequential()
top_model.add(Flatten(input_shape=model.output_shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(1, activation='sigmoid'))
top_model.load_weights('top_model.h5')

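Note that this top model was previously trained on bottleneck features, as described in the blog post. Next, add the top model to the base model and compile: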

model.add(top_model)
model.compile(loss='binary_crossentropy',
              optimizer=SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])

Finally, fit on the cats/dogs data:

batch_size = 16

train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1./255)

train_gen = train_datagen.flow_from_directory(
    TRAIN_DIR,
    target_size=(150, 150),
    batch_size=batch_size,
    class_mode='binary')

valid_gen = test_datagen.flow_from_directory(
    VALID_DIR,
    target_size=(150, 150),
    batch_size=batch_size,
    class_mode='binary')

model.fit_generator(
    train_gen,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=nb_epoch,
    validation_data=valid_gen,
    validation_steps=nb_valid_samples // batch_size)
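The `steps_per_epoch` and `validation_steps` arguments count batches, not samples, which is why the sample counts are divided by the batch size. A small sketch of that arithmetic follows. The sample counts are not given in the question; 2000 and 800 are assumed here because they appear in the code of Answer 3, and `steps_for` is a hypothetical helper.

```python
def steps_for(n_samples, batch_size):
    """Number of full batches drawn from the generator per pass."""
    return n_samples // batch_size


batch_size = 16
nb_train_samples = 2000   # assumed value, taken from Answer 3
nb_valid_samples = 800    # assumed value, taken from Answer 3

train_steps = steps_for(nb_train_samples, batch_size)
valid_steps = steps_for(nb_valid_samples, batch_size)
leftover = nb_train_samples % batch_size   # samples not covered by full batches
print(train_steps, valid_steps, leftover)
```

When the sample count is not a multiple of the batch size, floor division leaves a few samples out of each pass.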

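However, this is the error I get when trying to fit:

ValueError: Error when checking model target: expected block5_maxpool to have 4 dimensions, but got array with shape (16, 1)

So it seems that something is wrong with the last pooling layer in the base model. Or perhaps I did something wrong when connecting the base model to the top model.

Has anyone had a similar problem? Or is there a better way to build such "concatenated" models? I am using keras==2.0.0 with the theano backend.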

Note: I tried the examples from the gist and the applications.VGG16 utility, but had issues concatenating the models, and I am not very familiar with the Keras functional API. So the solution I provide here is the most "successful" one, i.e. it fails only at the fitting stage.

Update #1

OK, here is a short explanation of what I am trying to do. First, I generate bottleneck features from VGG16 as follows:

def save_bottleneck_features():
    datagen = ImageDataGenerator(rescale=1./255)
    model = applications.VGG16(include_top=False, weights='imagenet')

    generator = datagen.flow_from_directory(
        TRAIN_DIR,
        target_size=(150, 150),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)
    print("Predicting train samples..")
    # In Keras 2 the second argument is the number of batches (steps), not samples
    bottleneck_features_train = model.predict_generator(generator, nb_train_samples // batch_size)
    # .npy is a binary format, so the file must be opened in 'wb' mode
    np.save(open('bottleneck_features_train.npy', 'wb'), bottleneck_features_train)

    generator = datagen.flow_from_directory(
        VALID_DIR,
        target_size=(150, 150),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)
    print("Predicting valid samples..")
    bottleneck_features_valid = model.predict_generator(generator, nb_valid_samples // batch_size)
    np.save(open('bottleneck_features_valid.npy', 'wb'), bottleneck_features_valid)
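Since `.npy` is a binary format, file handles passed to `np.save` and `np.load` need binary mode on Python 3; passing a path instead lets NumPy open the file itself. The round trip below is a minimal sketch using a zero-filled stand-in array and a temporary directory, both assumptions made only for illustration.

```python
import os
import tempfile

import numpy as np

# Stand-in for real bottleneck features: 10 samples of shape (4, 4, 512).
features = np.zeros((10, 4, 4, 512), dtype=np.float32)
path = os.path.join(tempfile.mkdtemp(), 'bottleneck_features_train.npy')

# Option 1: pass a path and let NumPy handle the file mode.
np.save(path, features)
restored = np.load(path)

# Option 2: pass explicit file objects opened in binary mode.
with open(path, 'wb') as f:
    np.save(f, features)
with open(path, 'rb') as f:
    restored_from_handle = np.load(f)

print(restored.shape, restored_from_handle.dtype)
```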

Then I create a top model and train it on these features as follows:

def train_top_model():
    # .npy files are binary, so open them in 'rb' mode
    train_data = np.load(open('bottleneck_features_train.npy', 'rb'))
    # Integer division: list repetition requires an int on Python 3
    train_labels = np.array([0]*(nb_train_samples // 2) +
                            [1]*(nb_train_samples // 2))
    valid_data = np.load(open('bottleneck_features_valid.npy', 'rb'))
    valid_labels = np.array([0]*(nb_valid_samples // 2) +
                            [1]*(nb_valid_samples // 2))
    model = Sequential()
    model.add(Flatten(input_shape=train_data.shape[1:]))
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
    # Keras 2 renamed nb_epoch to epochs
    model.fit(train_data, train_labels,
              epochs=nb_epoch,
              batch_size=batch_size,
              validation_data=(valid_data, valid_labels),
              verbose=1)
    model.save_weights('top_model.h5')
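The hand-built label arrays only line up with the bottleneck features under some assumptions: the generator was created with `shuffle=False`, the two classes are the same size, and the class directories are read in sorted order so that all samples of the first class come first. A minimal sketch of the label construction, with a hypothetical helper name and an assumed sample count of 2000:

```python
def make_binary_labels(n_samples):
    """Labels for an unshuffled, balanced two-class set: first half 0, second half 1."""
    half = n_samples // 2   # must be an int; n_samples / 2 is a float on Python 3
    return [0] * half + [1] * half


labels = make_binary_labels(2000)

# Repeating a list by a float is rejected on Python 3.
try:
    [0] * (2000 / 2)
    float_repeat_failed = False
except TypeError:
    float_repeat_failed = True

print(len(labels), sum(labels), float_repeat_failed)
```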

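Basically, there are two trained models: base_model with the ImageNet weights, and top_model with weights trained on the bottleneck features. I want to know how to connect them. Is that possible, or am I doing something wrong? As far as I can see, the response from @thomas-pinetz assumes that the top model is not trained separately and is attached to the base model right away. I'm not sure I'm being clear, so here is a quote from the blog: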

In order to perform fine-tuning, all layers should start with properly trained weights: for instance you should not slap a randomly initialized fully-connected network on top of a pre-trained convolutional base. This is because the large gradient updates triggered by the randomly initialized weights would wreck the learned weights in the convolutional base. In our case this is why we first train the top-level classifier, and only then start fine-tuning convolutional weights alongside it.

Answer 1

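I think the weights described by the VGG net do not fit your model, and the error stems from that. Anyway, there is a better way to do this using the network itself, as described in (https://keras.io/applications/#vgg16).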

You can just use:

base_model = keras.applications.vgg16.VGG16(include_top=False, weights='imagenet', input_tensor=None, input_shape=None)

to instantiate a pretrained VGG net. Then you can freeze the layers and use the Model class to instantiate your own model like this:

x = base_model.output
x = Flatten()(x)
x = Dense(your_classes, activation='softmax')(x)  # your_classes: number of output classes
new_model = Model(inputs=base_model.input, outputs=x)  # Keras 2 keyword names

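To combine the bottom network and the top network, you can use the following code snippet. It uses the following functions (Input layer (https://keras.io/getting-started/functional-api-guide/) / load_model (https://keras.io/getting-started/faq/#how-can-i-save-a-keras-model)) together with Keras's functional API: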

final_input = Input(shape=(3, 224, 224))
base_model = vgg...
top_model = load_model(weights_file)

x = base_model(final_input)
result = top_model(x)
final_model = Model(inputs=final_input, outputs=result)

Answer 2

I think you can connect the two by doing the following:

#load vgg model
vgg_model = applications.VGG16(weights='imagenet', include_top=False, input_shape=(150, 150, 3))
print('Model loaded.')

#initialise top model
top_model = Sequential()
top_model.add(Flatten(input_shape=vgg_model.output_shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(1, activation='sigmoid'))


top_model.load_weights(top_model_weights_path)

# add the model on top of the convolutional base

model = Model(inputs=vgg_model.input, outputs=top_model(vgg_model.output))

This solution refers to the example "Fine-tuning the top layers of a pre-trained network". The full code can be found here.
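The approach in this answer can be put together as one self-contained sketch that also adds the freezing step. It is written against `tensorflow.keras` rather than the standalone keras==2.0.0 from the question, which is an assumption, and it uses `weights=None` so that nothing is downloaded; real fine-tuning would use `weights='imagenet'` and load the separately trained top-model weights where the commented line indicates. Freezing by the `block5` name prefix relies on the layer names used by `applications.VGG16`.

```python
from tensorflow.keras import Input
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.optimizers import SGD

# weights=None keeps the sketch offline; use weights='imagenet' in practice.
vgg_model = VGG16(weights=None, include_top=False, input_shape=(150, 150, 3))

# Freeze everything except the last convolutional block.
for layer in vgg_model.layers:
    layer.trainable = layer.name.startswith('block5')

feature_shape = tuple(vgg_model.output.shape[1:])

top_model = Sequential([
    Input(shape=feature_shape),
    Flatten(),
    Dense(256, activation='relu'),
    Dropout(0.5),
    Dense(1, activation='sigmoid'),
])
# top_model.load_weights(top_model_weights_path)  # weights trained on bottleneck features

# Chain the two models through the functional API.
model = Model(inputs=vgg_model.input, outputs=top_model(vgg_model.output))
model.compile(loss='binary_crossentropy',
              optimizer=SGD(learning_rate=1e-4, momentum=0.9),
              metrics=['accuracy'])

trainable_names = [layer.name for layer in vgg_model.layers if layer.trainable]
print(feature_shape, trainable_names)
```

Because the combined model is built in a single `Model(...)` call, its output is the sigmoid unit of the top model, so binary labels of shape (batch, 1) match the target shape.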

Answer 3

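OK, I think what Thomas and Gowtham posted is correct (and their answers are more concise), but I wanted to share the code that I was able to run successfully: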

def train_finetuned_model(lr=1e-5, verbose=True):
    file_path = get_file('vgg16.h5', VGG16_WEIGHTS_PATH, cache_subdir='models')
    if verbose:
        print('Building VGG16 (no-top) model to generate bottleneck features.')

    vgg16_notop = build_vgg_16()
    vgg16_notop.load_weights(file_path)
    for _ in range(6):
        vgg16_notop.pop()
    vgg16_notop.compile(optimizer=RMSprop(lr=lr), loss='categorical_crossentropy', metrics=['accuracy'])    

    if verbose:
        print('Bottleneck features generation.')

    train_batches = get_batches('train', shuffle=False, class_mode=None, batch_size=BATCH_SIZE)
    train_labels = np.array([0]*1000 + [1]*1000)
    train_bottleneck = vgg16_notop.predict_generator(train_batches, steps=2000 // BATCH_SIZE)
    valid_batches = get_batches('valid', shuffle=False, class_mode=None, batch_size=BATCH_SIZE)
    valid_labels = np.array([0]*400 + [1]*400)
    valid_bottleneck = vgg16_notop.predict_generator(valid_batches, steps=800 // BATCH_SIZE)

    if verbose:
        print('Training top model on bottleneck features.')

    top_model = Sequential()
    top_model.add(Flatten(input_shape=train_bottleneck.shape[1:]))
    top_model.add(Dense(4096, activation='relu'))
    top_model.add(Dropout(0.5))
    top_model.add(Dense(4096, activation='relu'))
    top_model.add(Dropout(0.5))
    top_model.add(Dense(2, activation='softmax'))
    top_model.compile(optimizer=RMSprop(lr=lr), loss='categorical_crossentropy', metrics=['accuracy'])
    top_model.fit(train_bottleneck, to_categorical(train_labels),
                  batch_size=32, epochs=10,
                  validation_data=(valid_bottleneck, to_categorical(valid_labels)))

    if verbose:
        print('Concatenate new VGG16 (without top layer) with pretrained top model.')

    vgg16_fine = build_vgg_16()
    vgg16_fine.load_weights(file_path)
    for _ in range(6):
        vgg16_fine.pop()
    vgg16_fine.add(Flatten(name='top_flatten'))    
    vgg16_fine.add(Dense(4096, activation='relu'))
    vgg16_fine.add(Dropout(0.5))
    vgg16_fine.add(Dense(4096, activation='relu'))
    vgg16_fine.add(Dropout(0.5))
    vgg16_fine.add(Dense(2, activation='softmax'))
    vgg16_fine.compile(optimizer=RMSprop(lr=lr), loss='categorical_crossentropy', metrics=['accuracy'])

    if verbose:
        print('Loading pre-trained weights into concatenated model')

    for i, layer in enumerate(reversed(top_model.layers), 1):
        pretrained_weights = layer.get_weights()
        vgg16_fine.layers[-i].set_weights(pretrained_weights)

    for layer in vgg16_fine.layers[:26]:
        layer.trainable = False

    if verbose:
        print('Layers training status:')
        for layer in vgg16_fine.layers:
            print('[%6s] %s' % ('' if layer.trainable else 'FROZEN', layer.name))        

    vgg16_fine.compile(optimizer=RMSprop(lr=1e-6), loss='binary_crossentropy', metrics=['accuracy'])

    if verbose:
        print('Train concatenated model on dogs/cats dataset sample.')

    train_datagen = ImageDataGenerator(rescale=1./255,
                                       shear_range=0.2,
                                       zoom_range=0.2,
                                       horizontal_flip=True)
    test_datagen = ImageDataGenerator(rescale=1./255)
    train_batches = get_batches('train', gen=train_datagen, class_mode='categorical', batch_size=BATCH_SIZE)
    valid_batches = get_batches('valid', gen=test_datagen, class_mode='categorical', batch_size=BATCH_SIZE)
    vgg16_fine.fit_generator(train_batches, epochs=100,
                             steps_per_epoch=2000 // BATCH_SIZE,
                             validation_data=valid_batches,
                             validation_steps=800 // BATCH_SIZE)
    return vgg16_fine

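It is a bit too verbose, and everything is done manually (i.e. copying the weights from the pretrained layers into the concatenated model), but it more or less works.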
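The manual weight copy walks the top model's layers from last to first and writes each set of weights into the layer at the same distance from the end of the concatenated model. The sketch below isolates that indexing pattern with a hypothetical `FakeLayer` stand-in instead of real Keras layers, so the layer names and weight values are purely illustrative.

```python
class FakeLayer:
    """Minimal stand-in exposing the get_weights/set_weights pair of a Keras layer."""

    def __init__(self, name, weights=None):
        self.name = name
        self._weights = list(weights or [])

    def get_weights(self):
        return list(self._weights)

    def set_weights(self, weights):
        self._weights = list(weights)


def copy_tail_weights(source_layers, target_layers):
    """Copy each source layer's weights into the matching trailing target layer."""
    for i, layer in enumerate(reversed(source_layers), 1):
        target_layers[-i].set_weights(layer.get_weights())


top = [FakeLayer('flatten'), FakeLayer('dense_a', [1.0]), FakeLayer('dense_b', [2.0])]
full = [FakeLayer('conv'), FakeLayer('pool'),
        FakeLayer('top_flatten'), FakeLayer('fc1'), FakeLayer('fc2')]

copy_tail_weights(top, full)
print([(layer.name, layer.get_weights()) for layer in full])
```

The pattern only works when the tail of the target model has the same layer types and shapes, in the same order, as the source model.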

The code I posted does have a problem with low accuracy (around 70%), but that is a different story.
