Extracting nested layer features (from a pretrained model) with the Keras Sequential API
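I have the following simple model for transfer learning: a pretrained model (VGG16) without its FC layers, followed by a few new layers, defined with the Keras Sequential API.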
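import tensorflow as tf

# 150x150 input matches the (None, 4, 4, 512) VGG16 output in the summary below
IMG_SHAPE = (150, 150, 3)
# vgg16 without the fully connected top
pretrained_model = tf.keras.applications.vgg16.VGG16(
    weights='imagenet',
    include_top=False,
    input_shape=IMG_SHAPE,
)
# freeze pretrained layers
pretrained_model.trainable = False
model = tf.keras.Sequential([
    pretrained_model,
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Flatten(),
    # two output classes, matching the dense layer in the summary below
    tf.keras.layers.Dense(2, activation='softmax'),
])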
Note that the model summary does not show the internal layers of VGG16:
model.summary()
#Model: "sequential"
#_________________________________________________________________
# Layer (type) Output Shape Param #
#=================================================================
# vgg16 (Functional) (None, 4, 4, 512) 14714688
#
# batch_normalization (BatchN (None, 4, 4, 512) 2048
# ormalization)
#
# flatten (Flatten) (None, 8192) 0
# dense (Dense) (None, 2) 16386
#=================================================================
#Total params: 14,733,122
#Trainable params: 17,410
#Non-trainable params: 14,715,712
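To make the nesting visible, here is a minimal sketch that lists the layer names of the outer Sequential model and of the VGG16 model nested inside it. It assumes the same architecture as above, but uses weights=None (my substitution, only to avoid downloading the ImageNet weights); the names `base`, `nested`, `outer_names` and `inner_names` are illustrative and not part of the original code.

```python
import tensorflow as tf

# Assumption: weights=None keeps the sketch offline; the original uses 'imagenet'.
base = tf.keras.applications.vgg16.VGG16(
    weights=None,
    include_top=False,
    input_shape=(150, 150, 3),
)
model = tf.keras.Sequential([
    base,
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(2, activation='softmax'),
])

# The outer model only knows about four layers; VGG16 is a single nested layer.
outer_names = [layer.name for layer in model.layers]

# The convolutional blocks live inside the nested model.
nested = model.layers[0]
inner_names = [layer.name for layer in nested.layers]

print(outer_names)
print(inner_names)
```

The second list should contain block5_conv3, while the first should not, which is why model.get_layer('block5_conv3') cannot work on the outer model.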
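I have trained the above model on my custom dataset and, with transfer learning, reached the desired accuracy on my test set.
Now suppose I want to create a new model (e.g., to compute activation maps) that takes the same input as the previous model and returns two outputs: an intermediate output (the features of a convolutional layer of the pretrained model, e.g. block5_conv3) and the output of the previous model. That is where I am stuck and getting errors. For example, I defined the new model as follows: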
grad_model = tf.keras.models.Model(
[pretrained_model.inputs],
[pretrained_model.get_layer('block5_conv3').output, model.output]
)
This raises the following error:
ValueError: Graph disconnected: cannot obtain value for tensor KerasTensor(type_spec=TensorSpec(shape=(None, 150, 150, 3), dtype=tf.float32, name='vgg16_input'), name='vgg16_input', description="created by layer 'vgg16_input'") at layer "vgg16". The following previous layers were accessed without issue: ['block1_conv1', 'block1_conv2', 'block1_pool', 'block2_conv1', 'block2_conv2', 'block2_pool', 'block3_conv1', 'block3_conv2', 'block3_conv3', 'block3_pool', 'block4_conv1', 'block4_conv2', 'block4_conv3']
Or like this:
grad_model = tf.keras.models.Model(
[model.inputs],
[pretrained_model.get_layer('block5_conv3').output, model.output]
)
This raises the following error:
ValueError: Graph disconnected: cannot obtain value for tensor KerasTensor(type_spec=TensorSpec(shape=(None, 150, 150, 3), dtype=tf.float32, name='input_1'), name='input_1', description="created by layer 'vgg16'") at layer "block1_conv1". The following previous layers were accessed without issue: []
I also tried giving the input layer of the nested pretrained model the same name as the first layer of the outer model:
pretrained_model.layers[0]._name = model.layers[0]._name
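but I get the same error.
I think I could change the model structure (e.g., use the Keras functional API) to define grad_model, but I am not sure how. More importantly, I would like to know whether there is a way to solve the problem without changing the model structure and without having to retrain.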
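One approach that might avoid retraining, sketched here under assumptions rather than confirmed for my setup: start from the nested VGG16's own input and output tensors and call the already-trained head layers of the Sequential model on them again. Calling a layer again reuses its weights, so nothing is re-initialized. The sketch uses weights=None (my substitution, to avoid the ImageNet download) and an untrained head; the names `vgg`, `batch` and `ref` are illustrative.

```python
import numpy as np
import tensorflow as tf

# Assumption: weights=None keeps the sketch offline; the original uses 'imagenet'.
base = tf.keras.applications.vgg16.VGG16(
    weights=None,
    include_top=False,
    input_shape=(150, 150, 3),
)
base.trainable = False
model = tf.keras.Sequential([
    base,
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(2, activation='softmax'),
])

# Re-apply the head layers on top of the nested model's own graph.
# The layer objects (and therefore their weights) are shared with `model`.
vgg = model.layers[0]
x = vgg.output
for layer in model.layers[1:]:
    x = layer(x)

# Both outputs now trace back to the same input tensor, so the graph is connected.
grad_model = tf.keras.Model(
    inputs=vgg.inputs,
    outputs=[vgg.get_layer('block5_conv3').output, x],
)

# Sanity check: the second output should match the original model's prediction.
batch = np.random.rand(1, 150, 150, 3).astype('float32')
conv_out, preds = grad_model(batch, training=False)
ref = model(batch, training=False)
print(tuple(conv_out.shape), np.allclose(np.asarray(preds), np.asarray(ref), atol=1e-5))
```

The design choice is to leave `model` untouched and build a second view over the same layers, so the trained Sequential model can still be used as before.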
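So far, following @M.Innat's comment, I could solve the problem by using the Keras functional API (note that the number of parameters stays the same) and retraining: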
inputs = tf.keras.Input(shape=IMG_SHAPE)
# VGG16 built directly on `inputs`, so its layers become part of this model's graph
base_model = tf.keras.applications.vgg16.VGG16(
    input_tensor=inputs,
    include_top=False,
    weights='imagenet',
)
# freeze pretrained layers
base_model.trainable = False
x = tf.keras.layers.BatchNormalization()(base_model.output)
x = tf.keras.layers.Flatten()(x)
x = tf.keras.layers.Dense(2, activation='softmax')(x)
model = tf.keras.Model(inputs, x)
model.summary()
# Model: "model_1262"
#_________________________________________________________________
# Layer (type) Output Shape Param #
#=================================================================
# input_3 (InputLayer) [(None, 150, 150, 3)] 0
#
# block1_conv1 (Conv2D) (None, 150, 150, 64) 1792
#
# block1_conv2 (Conv2D) (None, 150, 150, 64) 36928
#
# block1_pool (MaxPooling2D) (None, 75, 75, 64) 0
#
# block2_conv1 (Conv2D) (None, 75, 75, 128) 73856
#
# block2_conv2 (Conv2D) (None, 75, 75, 128) 147584
#
# block2_pool (MaxPooling2D) (None, 37, 37, 128) 0
#
# block3_conv1 (Conv2D) (None, 37, 37, 256) 295168
#
# block3_conv2 (Conv2D) (None, 37, 37, 256) 590080
#
# block3_conv3 (Conv2D) (None, 37, 37, 256) 590080
#
# block3_pool (MaxPooling2D) (None, 18, 18, 256) 0
#
# block4_conv1 (Conv2D) (None, 18, 18, 512) 1180160
#
# block4_conv2 (Conv2D) (None, 18, 18, 512) 2359808
#
# block4_conv3 (Conv2D) (None, 18, 18, 512) 2359808
#
# block4_pool (MaxPooling2D) (None, 9, 9, 512) 0
#
# block5_conv1 (Conv2D) (None, 9, 9, 512) 2359808
#
# block5_conv2 (Conv2D) (None, 9, 9, 512) 2359808
#
# block5_conv3 (Conv2D) (None, 9, 9, 512) 2359808
#
# block5_pool (MaxPooling2D) (None, 4, 4, 512) 0
#
# batch_normalization_9 (Batc (None, 4, 4, 512) 2048
# hNormalization)
#
# flatten_1 (Flatten) (None, 8192) 0
#
# dense_1 (Dense) (None, 2) 16386
#
#=================================================================
#Total params: 14,733,122
#Trainable params: 17,410
#Non-trainable params: 14,715,712
#_________________________________________________________________
With this model, the following code for extracting the intermediate features needed for class activation maps works:
grad_model = tf.keras.models.Model(
[model.inputs], [model.get_layer('block5_conv3').output, model.output]
)
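For context, this is roughly how I intend to use grad_model: a minimal Grad-CAM-style sketch, not a definitive implementation. It rebuilds the functional model with weights=None (my substitution, to avoid the ImageNet download, so the heatmap itself is meaningless here); the function name `grad_cam` and its parameters are hypothetical.

```python
import numpy as np
import tensorflow as tf

IMG_SHAPE = (150, 150, 3)
inputs = tf.keras.Input(shape=IMG_SHAPE)
# Assumption: weights=None keeps the sketch offline; the original uses 'imagenet'.
base_model = tf.keras.applications.vgg16.VGG16(
    input_tensor=inputs, include_top=False, weights=None
)
x = tf.keras.layers.BatchNormalization()(base_model.output)
x = tf.keras.layers.Flatten()(x)
outputs = tf.keras.layers.Dense(2, activation='softmax')(x)
model = tf.keras.Model(inputs, outputs)

grad_model = tf.keras.Model(
    model.inputs, [model.get_layer('block5_conv3').output, model.output]
)


def grad_cam(grad_model, image_batch, class_index=None):
    """Return a heatmap in [0, 1] for the first image of `image_batch`."""
    image_batch = tf.convert_to_tensor(image_batch, dtype=tf.float32)
    with tf.GradientTape() as tape:
        # Watching the input records every op, even through frozen layers.
        tape.watch(image_batch)
        conv_out, preds = grad_model(image_batch, training=False)
        if class_index is None:
            class_index = int(tf.argmax(preds[0]))
        class_score = preds[:, class_index]
    # Gradient of the class score w.r.t. the conv feature map.
    grads = tape.gradient(class_score, conv_out)
    # Average the gradients per channel to get channel importance weights.
    channel_weights = tf.reduce_mean(grads, axis=(0, 1, 2))
    # Weighted sum of the feature map channels, then ReLU and normalization.
    cam = tf.reduce_sum(conv_out[0] * channel_weights, axis=-1)
    cam = tf.nn.relu(cam)
    cam = cam / (tf.reduce_max(cam) + 1e-8)
    return cam.numpy()


heatmap = grad_cam(grad_model, np.random.rand(1, *IMG_SHAPE).astype('float32'))
print(heatmap.shape)
```

The heatmap has the spatial size of block5_conv3 (9x9 for a 150x150 input) and would still need to be resized to the image size before overlaying it.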