How to copy weights from a 2D convnet into a 3D convnet in Keras?

Question

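I am trying to implement a 3D convolutional network in Keras with the TensorFlow backend, followed by LSTM layers, to generate a sequence from a 3D image input.

I would like to start training from the weights of an existing pretrained model, to avoid the usual problems with random initialization.

To start with a basic example, I took VGG-16 and implemented a "3D" version of the network (without the FC layers):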

from keras.layers import Input, Conv3D, MaxPooling3D

# Input: 100 slices of 80x80 RGB
img_input = Input((100, 80, 80, 3))
x = Conv3D(64, (3, 3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
x = Conv3D(64, (3, 3, 3), activation='relu', padding='same', name='block1_conv2')(x)
x = MaxPooling3D((1, 2, 2), strides=(1, 2, 2), name='block1_pool')(x)

x = Conv3D(128, (3, 3, 3), activation='relu', padding='same', name='block2_conv1')(x)
x = Conv3D(128, (3, 3, 3), activation='relu', padding='same', name='block2_conv2')(x)
x = MaxPooling3D((1, 2, 2), strides=(1, 2, 2), name='block2_pool')(x)

x = Conv3D(256, (3, 3, 3), activation='relu', padding='same', name='block3_conv1')(x)
x = Conv3D(256, (3, 3, 3), activation='relu', padding='same', name='block3_conv2')(x)
x = Conv3D(256, (3, 3, 3), activation='relu', padding='same', name='block3_conv3')(x)
x = MaxPooling3D((1, 2, 2), strides=(1, 2, 2), name='block3_pool')(x)

x = Conv3D(512, (3, 3, 3), activation='relu', padding='same', name='block4_conv1')(x)
x = Conv3D(512, (3, 3, 3), activation='relu', padding='same', name='block4_conv2')(x)
x = Conv3D(512, (3, 3, 3), activation='relu', padding='same', name='block4_conv3')(x)
x = MaxPooling3D((1, 2, 2), strides=(1, 2, 2), name='block4_pool')(x)

x = Conv3D(512, (3, 3, 3), activation='relu', padding='same', name='block5_conv1')(x)
x = Conv3D(512, (3, 3, 3), activation='relu', padding='same', name='block5_conv2')(x)
x = Conv3D(512, (3, 3, 3), activation='relu', padding='same', name='block5_conv3')(x)
x = MaxPooling3D((1, 2, 2), strides=(1, 2, 2), name='block5_pool')(x)

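So I would like to know how to load the weights of the pretrained VGG-16 into each of the 100 slices (my 3D image consists of 100 RGB slices of 80x80).

Any advice you can give would be very useful.

Thanks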

Answer 1

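It depends on what you want to do in your application. If you just want to process the 3D image slice by slice, you can define a TimeDistributed VGG16 network (Conv2D instead of Conv3D).

For each of the layers you defined above, the model would become something like this: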

from keras.layers import Input, Conv2D, MaxPooling2D, TimeDistributed

img_input = Input((100, 80, 80, 3))
x = TimeDistributed(Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1', trainable=False))(img_input)
x = TimeDistributed(Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2', trainable=False))(x)
x = TimeDistributed(MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool', trainable=False))(x)
...
...

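Note that I have included the option trainable=False here. This is useful if you only want to train the deeper layers and keep the lower layers frozen at the pretrained VGG weights.

To load the VGG weights into the model, you can use Keras's load_weights function.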

model.load_weights(filepath, by_name=True)

If you give the layers you do not want to train the same names as in the VGG16 definition, you can load just those layers by name here.
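As an alternative to loading by name, the weights can be copied explicitly, layer by layer. As far as I can tell, by_name loading matches on the names of the model's own layers, and a TimeDistributed wrapper carries its own name, so the inner Conv2D names may not be picked up. The sketch below is one way around that. The helper `copy_wrapped_weights` is a hypothetical name, not a Keras function, and it assumes both arguments are built Keras models and that each wrapper exposes its wrapped layer as `.layer`, as TimeDistributed does.

```python
def copy_wrapped_weights(source_model, target_model):
    """Copy weights into wrapped layers, matched by the inner layer's name.

    source_model: e.g. a pretrained 2D VGG16 (assumed to be a Keras model).
    target_model: a model whose layers are TimeDistributed(Conv2D(...)) etc.
    Returns the list of layer names whose weights were copied.
    """
    source_names = {layer.name for layer in source_model.layers}
    copied = []
    for wrapper in target_model.layers:
        # TimeDistributed keeps the wrapped layer in `.layer`;
        # plain layers (e.g. the input layer) have no such attribute.
        inner = getattr(wrapper, "layer", None)
        if inner is None or inner.name not in source_names:
            continue
        weights = source_model.get_layer(inner.name).get_weights()
        if not weights:
            # Pooling layers have no weights, nothing to copy.
            continue
        inner.set_weights(weights)
        copied.append(inner.name)
    return copied
```

With real models, `source_model` could be the VGG16 from `keras.applications` built without the top layers, and `target_model` the TimeDistributed model above. Checking the returned list against the expected block names is a quick way to confirm that nothing was silently skipped.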

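However, spatiotemporal feature learning can be done better with 3D ConvNets. If that is central to your application, then you cannot import the VGG16 weights directly into the Conv3D model, because the number of parameters in each layer is now larger: the filters go from 3*3 to 3*3*3, for example.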
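To make the size mismatch concrete, here is a small calculation for the first layer, block1_conv1, which takes 3 input channels and has 64 filters. The function name `conv_param_count` is only illustrative.

```python
def conv_param_count(kernel_size, in_channels, filters):
    """Number of trainable parameters in a conv layer with a bias term."""
    weights_per_filter = in_channels
    for k in kernel_size:
        weights_per_filter *= k
    # One bias per filter on top of the kernel weights.
    return weights_per_filter * filters + filters

# block1_conv1: RGB input (3 channels), 64 filters.
params_2d = conv_param_count((3, 3), in_channels=3, filters=64)
params_3d = conv_param_count((3, 3, 3), in_channels=3, filters=64)
print(params_2d, params_3d)  # 1792 5248
```

The bias vector has the same size in both cases; only the kernel grows, by a factor equal to the added kernel depth.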

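You can still load the weights into the model layer by layer, by deciding which 3*3 patch of the 3*3*3 filter is best initialized with the VGG16 weights. The set_weights() function takes a list of numpy arrays as input (one for the kernel weights and one for the biases). You can extract the weights of each layer from VGG16, construct a new numpy array for the equivalent Conv3D weight tensor, and feed it to your Conv3D model.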
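A minimal numpy sketch of building the Conv3D kernel from a Conv2D kernel follows. It assumes the channels_last layout used by the TensorFlow backend, where a Conv2D kernel has shape (kh, kw, in, out) and a Conv3D kernel has shape (kd, kh, kw, in, out). The two function names are hypothetical, and the two strategies shown (center slice only, or the 2D kernel repeated and averaged over depth) are just two possible choices, not the only ones.

```python
import numpy as np

def inflate_center(kernel_2d, depth=3):
    """Put the 2D kernel in the middle depth slice and zeros elsewhere."""
    kernel_3d = np.zeros((depth,) + kernel_2d.shape, dtype=kernel_2d.dtype)
    kernel_3d[depth // 2] = kernel_2d
    return kernel_3d

def inflate_average(kernel_2d, depth=3):
    """Repeat the 2D kernel along depth and divide by depth,
    so the slices sum back to the original 2D kernel."""
    return np.repeat(kernel_2d[np.newaxis], depth, axis=0) / depth

# Stand-in for the block1_conv1 kernel of VGG16: shape (3, 3, 3, 64).
kernel_2d = np.random.rand(3, 3, 3, 64).astype("float32")

print(inflate_center(kernel_2d).shape)   # (3, 3, 3, 3, 64)
print(inflate_average(kernel_2d).shape)  # (3, 3, 3, 3, 64)
```

With Keras layers, the 2D weights would come from `kernel_2d, bias = conv2d_layer.get_weights()` and the result would go into the 3D layer with `conv3d_layer.set_weights([kernel_3d, bias])`; the bias is reused unchanged. With the center-slice variant, stride 1 and padding='same', the Conv3D layer should start out computing the same per-slice result as the 2D layer, and training can then fill in the other depth slices.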

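But I would encourage you to look at the existing literature and models for 3D image processing to see whether they give you a better initialization through transfer learning.

For example, C3D is one such popular model. ShapeNet and Pascal3D are popular 3D datasets.

Discussions of how to handle video data may also give you a better idea of how to proceed.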

python-3.x

conv-neural-network

keras

tensorflow

transfer-learning