How to remove first N layers from a Keras Model?
I want to remove the first N layers from a pretrained Keras model. Take an EfficientNetB0, for example, whose first 3 layers are responsible only for preprocessing:
import tensorflow as tf
efinet = tf.keras.applications.EfficientNetB0(weights=None, include_top=True)
print(efinet.layers[:3])
# [<tensorflow.python.keras.engine.input_layer.InputLayer at 0x7fa9a870e4d0>,
# <tensorflow.python.keras.layers.preprocessing.image_preprocessing.Rescaling at 0x7fa9a61343d0>,
# <tensorflow.python.keras.layers.preprocessing.normalization.Normalization at 0x7fa9a60d21d0>]
As discussed by M.Innat, the first layer is an Input layer, which should either be spared or re-attached. I would like to remove those layers, but a naive approach like the following raises an error:
cut_input_model = tf.keras.Model(
    inputs=[efinet.layers[3].input],
    outputs=efinet.outputs
)
This leads to:
ValueError: Graph disconnected: cannot obtain value for tensor KerasTensor(...)
What would be the recommended approach?
The reason you get the Graph disconnected error is that the Input layer is not specified. But that's not the main issue here. Removing intermediate layers from a keras model is sometimes not straightforward, with either the Sequential or the Functional API.
For a Sequential model it should be comparatively easy, whereas in a Functional model you need to take care of multi-input blocks (e.g. multiply, add, etc.). For example, if you want to remove some intermediate layers in a Sequential model, you can easily adapt an existing solution (a minimal sketch follows below). But for the functional model (efficientnet), you can't, because of the multi-input internal blocks, and you will encounter this error: ValueError: A merged layer should be called on a list of inputs. So that needs a bit more work AFAIK; here is a workaround to overcome it.
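For instance, here is a minimal Sequential sketch (my addition, not part of the original answer, with made-up layer names): an intermediate layer can be dropped simply by rebuilding the model from the kept layers, which reuses their weights.

import tensorflow as tf

# Toy Sequential model; 'd2' is the intermediate layer we want to drop.
seq_model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, input_shape=(8,), name='d1'),
    tf.keras.layers.Dense(32, name='d2'),  # layer to remove
    tf.keras.layers.Dense(10, name='d3'),
])

# Rebuild from the kept layers; this only works because the shapes
# still line up after dropping 'd2' (32 -> 32).
trimmed = tf.keras.Sequential([l for l in seq_model.layers if l.name != 'd2'])
trimmed.build(input_shape=(None, 8))
trimmed.summary()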
Here I will show a simple workaround for your case, but it may not be general and may also be unsafe in some cases; it is based on using the pop method (be aware of why it can be unsafe to use). OK, let's first load the model.
func_model = tf.keras.applications.EfficientNetB0()

for i, l in enumerate(func_model.layers):
    print(l.name, l.output_shape)
    if i == 8: break
input_19 [(None, 224, 224, 3)]
rescaling_13 (None, 224, 224, 3)
normalization_13 (None, 224, 224, 3)
stem_conv_pad (None, 225, 225, 3)
stem_conv (None, 112, 112, 32)
stem_bn (None, 112, 112, 32)
stem_activation (None, 112, 112, 32)
block1a_dwconv (None, 112, 112, 32)
block1a_bn (None, 112, 112, 32)
Next, use the .pop method:
func_model._layers.pop(1) # remove rescaling
func_model._layers.pop(1) # remove normalization

for i, l in enumerate(func_model.layers):
    print(l.name, l.output_shape)
    if i == 8: break
input_22 [(None, 224, 224, 3)]
stem_conv_pad (None, 225, 225, 3)
stem_conv (None, 112, 112, 32)
stem_bn (None, 112, 112, 32)
stem_activation (None, 112, 112, 32)
block1a_dwconv (None, 112, 112, 32)
block1a_bn (None, 112, 112, 32)
block1a_activation (None, 112, 112, 32)
block1a_se_squeeze (None, 32)
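To see concretely why this can be unsafe (a hedged check of my own, not from the original answer): popping from the private _layers list only edits the bookkeeping that the layer listing reads; the functional call graph built at construction time is untouched, so inference still runs the removed Rescaling and Normalization layers.

import numpy as np

# Both models load the default imagenet weights, so if the popped layers
# were really gone the outputs would differ; they don't, because call()
# follows the original graph rather than the edited _layers list.
x = (np.random.rand(1, 224, 224, 3) * 255).astype('float32')
ref_model = tf.keras.applications.EfficientNetB0()
np.testing.assert_allclose(func_model.predict(x), ref_model.predict(x), atol=1e-4)
print('outputs unchanged: the popped layers still run')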
I have been trying to do the same thing with the keras tensorflow VGGFace model. After a lot of experimenting I found that this approach works. In this case all of the model is used except for the last layer, which is replaced with a custom embeddings layer:
# Imports assumed: VGGFace comes from the keras_vggface package.
from keras_vggface.vggface import VGGFace
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

vgg_model = VGGFace(include_top=True, input_shape=(224, 224, 3)) # full VGG16 model
inputs = Input(shape=(224, 224, 3))
x = inputs
# Assemble all layers except for the last layer
for layer in vgg_model.layers[1:-2]:
    x = vgg_model.get_layer(layer.name)(x)
# Now add a new last layer that provides the 128 embeddings output
x = Dense(128, activation='softmax', use_bias=False, name='fc8x')(x)
# Create the custom model
custom_vgg_model = Model(inputs, x, name='custom_vggface')
Unlike layers[x] or pop(), get_layer gets the actual layer object, allowing the layers to be assembled into a new set of outputs from which you can then create a new model. The 'for' statement starts at 1 rather than 0 because the input layer is already defined by 'inputs'.
This method works for sequential-style models; it is unclear whether it would work for more complicated ones. The same pattern can also be turned around to drop the first layers instead of the last one, as sketched below.
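To tie this back to the original question, here is a sketch (my addition, with hypothetical layer names) of the same get_layer re-assembly used to drop the first, preprocessing layer of a toy linear model; it only works while the remaining graph is a simple chain.

import tensorflow as tf
from tensorflow.keras import layers, Model

# Toy model with a preprocessing front layer named 'pre'.
# (tf.keras.layers.Rescaling; in older TF it lives under
# layers.experimental.preprocessing.)
base = tf.keras.Sequential([
    layers.Rescaling(1. / 255, input_shape=(32, 32, 3), name='pre'),
    layers.Conv2D(8, 3, name='conv'),
    layers.GlobalAveragePooling2D(name='pool'),
])

# Re-assemble everything except 'pre' onto a fresh input.
inputs = layers.Input(shape=(32, 32, 3))
x = inputs
for layer in base.layers[1:]:  # skip the preprocessing layer
    x = base.get_layer(layer.name)(x)
trimmed = Model(inputs, x, name='no_preprocessing')
trimmed.summary()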
For me, @M.Innat's solution resulted in a disconnected graph, since just popping the layers is not enough: a connection also needs to be made between the input layer and the first convolution layer (you can inspect the problem in Netron).
The only correct solution that worked for me was manually editing the model's config.
Here is a complete script that removes the preprocessing part of EfficientNet-B1. Tested with TF2.
import random

import tensorflow as tf

def split(model, start, end):
    confs = model.get_config()
    kept_layers = set()
    for i, l in enumerate(confs['layers']):
        if i == 0:
            confs['layers'][0]['config']['batch_input_shape'] = model.layers[start].input_shape
            if i != start:
                # rename the input layer to avoid conflicts on merge
                confs['layers'][0]['name'] += str(random.randint(0, 100000000))
                confs['layers'][0]['config']['name'] = confs['layers'][0]['name']
        elif i < start or i > end:
            continue
        kept_layers.add(l['name'])
    # filter layers
    layers = [l for l in confs['layers'] if l['name'] in kept_layers]
    # rewire the first kept layer so it is fed by the (renamed) input layer
    layers[1]['inbound_nodes'][0][0][0] = layers[0]['name']
    # set conf
    confs['layers'] = layers
    confs['input_layers'][0][0] = layers[0]['name']
    confs['output_layers'][0][0] = layers[-1]['name']
    # create new model and copy the weights over, layer by layer
    submodel = tf.keras.Model.from_config(confs)
    for l in submodel.layers:
        try:
            orig_l = model.get_layer(l.name)
        except ValueError:
            continue  # e.g. the renamed input layer has no counterpart
        l.set_weights(orig_l.get_weights())
    return submodel

model = tf.keras.applications.efficientnet.EfficientNetB1()
# first kept layer = 3 (stem_conv_pad), last layer = 341
new_model = split(model, 3, 341)
new_model.summary()
new_model.save("efficientnet_b1.h5")
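As a sanity check (my addition, not part of the original script), you can verify the split model against the full one by applying the cut-away preprocessing manually. The layer names 'rescaling' and 'normalization' are assumed from a fresh session; check model.summary() if they differ.

import numpy as np

# Raw image-range input; EfficientNetB1 expects 240x240 inputs.
raw = (np.random.rand(1, 240, 240, 3) * 255).astype('float32')

# Reproduce the preprocessing that was cut away, then feed the split model.
pre = model.get_layer('normalization')(model.get_layer('rescaling')(raw))
np.testing.assert_allclose(model.predict(raw), new_model.predict(pre), atol=1e-4)
print('split model matches the original on preprocessed inputs')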
The script is based on this great answer.