Error Saving & Loading Tensorflow/Keras Model With Custom Classes/Functions

I recently created a Tensorflow/Keras model using Keras Transformers. For this, I wrote custom PositionalEmbedding and TransformerEncoder classes and used them to build the model architecture. They are defined like this:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


class PositionalEmbedding(layers.Layer):
    def __init__(self, sequence_length, output_dim, **kwargs):
        super().__init__(**kwargs)
        self.position_embeddings = layers.Embedding(
            input_dim=sequence_length, output_dim=output_dim
        )
        self.sequence_length = sequence_length
        self.output_dim = output_dim

    def call(self, inputs):
        # The inputs are of shape: `(batch_size, frames, num_features)`
        length = tf.shape(inputs)[1]
        positions = tf.range(start=0, limit=length, delta=1)
        embedded_positions = self.position_embeddings(positions)
        return inputs + embedded_positions

    def compute_mask(self, inputs, mask=None):
        mask = tf.reduce_any(tf.cast(inputs, "bool"), axis=-1)
        return mask

class TransformerEncoder(layers.Layer):
    def __init__(self, embed_dim, dense_dim, num_heads, **kwargs):
        super().__init__(**kwargs)
        self.embed_dim = embed_dim
        self.dense_dim = dense_dim
        self.num_heads = num_heads
        self.attention = layers.MultiHeadAttention(
            num_heads=num_heads, key_dim=embed_dim, dropout=0.3
        )
        self.dense_proj = keras.Sequential(
            [layers.Dense(dense_dim, activation=tf.nn.gelu), layers.Dense(embed_dim),]
        )
        self.layernorm_1 = layers.LayerNormalization()
        self.layernorm_2 = layers.LayerNormalization()

    def call(self, inputs, mask=None):
        if mask is not None:
            mask = mask[:, tf.newaxis, :]

        attention_output = self.attention(inputs, inputs, attention_mask=mask)
        proj_input = self.layernorm_1(inputs + attention_output)
        proj_output = self.dense_proj(proj_input)
        return self.layernorm_2(proj_input + proj_output)

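At first, I couldn't even save this model with the usual model.save() method. However, I was able to fix that by adding a get_config() method to each class, like this: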

### FOR THE PositionalEmbedding CLASS
def get_config(self):
    config = super().get_config().copy()
    config.update({
        'position_embeddings': self.position_embeddings,
        'sequence_length': self.sequence_length,
        'output_dim': self.output_dim
    })
    return config

### FOR THE TransformerEncoder CLASS
def get_config(self):
    config = super().get_config().copy()
    config.update({
        'embed_dim': self.embed_dim,
        'dense_dim': self.dense_dim,
        'num_heads': self.num_heads,
        'attention': self.attention,
        'dense_proj': self.dense_proj,
        'layernorm_1': self.layernorm_1,
        'layernorm_2': self.layernorm_2
    })
    return config

However, when I try to load the model with the Keras load_model() method without the custom_objects argument, I get the following error:

ValueError: Unknown layer: PositionalEmbedding. Please ensure this object is passed to the `custom_objects` argument.

If I call load_model() without first defining the classes, while passing both classes through the custom_objects argument, i.e. load_model('my_model.h5', custom_objects={'PositionalEmbedding': PositionalEmbedding, 'TransformerEncoder': TransformerEncoder}), I get the following error:

NameError: name 'PositionalEmbedding' is not defined

Finally, if I define the classes with the updated config before loading, and call load_model() as shown in the previous example, I get the following error:

TypeError: ('Keyword argument not understood:', 'position_embeddings')

Does anyone know what might be causing these problems and how I can fix them in order to load this model? I appreciate your help!

Thanks!

Sam

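So I was actually able to get around this with a workaround. Instead of saving the model and loading it the old-fashioned way, I saved a checkpoint of the model's weights during training, then loaded it by building a new model from scratch and loading the checkpoint into it as weights.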

Here is the code:

### SAVING THE MODEL WITH CHECKPOINT
filepath = "/content/drive/MyDrive/tmp/model_checkpoint.ckpt"
checkpoint = keras.callbacks.ModelCheckpoint(
    filepath, save_weights_only=True, save_best_only=True, verbose=1
)

history = model.fit(
    train_data,
    train_labels,
    validation_split=0.3,
    epochs=250,
    batch_size=256,
    callbacks=[checkpoint],
)

### CREATING NEW MODEL & LOADING CHECKPOINT AS WEIGHTS
def get_compiled_model():
    sequence_length = MAX_SEQ_LENGTH
    embed_dim = NUM_FEATURES
    dense_dim = 4
    num_heads = 1
    classes = len(label_processor.get_vocabulary())

    inputs = keras.Input(shape=(None, None))
    x = PositionalEmbedding(
        sequence_length, embed_dim, name="frame_position_embedding"
    )(inputs)
    x = TransformerEncoder(embed_dim, dense_dim, num_heads, name="transformer_layer")(x)
    x = layers.GlobalMaxPooling1D()(x)
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(classes, activation="softmax")(x)
    model = keras.Model(inputs, outputs)

    model.compile(
        optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
    )
    return model

model = get_compiled_model()

model.load_weights("/content/drive/MyDrive/tmp/model_checkpoint.ckpt")
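
A possible alternative to the checkpoint workaround, offered as a sketch and not tested against the model above: the TypeError in the question is consistent with how Keras rebuilds a layer. By default, from_config() calls the class constructor with the saved config as keyword arguments, so every key returned by get_config() has to be a constructor argument. Sub-layers such as position_embeddings are recreated inside __init__ and should not appear in the config. The example below uses a plain-Python stand-in named Layer (hypothetical, a simplified assumption about the TF 2.x base class, not the Keras class itself) so that it runs without TensorFlow; BrokenPositionalEmbedding and try_round_trip are likewise names made up for illustration.

```python
class Layer:
    """Hypothetical stand-in for keras.layers.Layer (plain Python, no TensorFlow)."""

    def __init__(self, name=None, **kwargs):
        # Assumption: like the TF 2.x base class, reject unknown keyword arguments.
        for key in kwargs:
            raise TypeError("Keyword argument not understood:", key)
        self.name = name

    def get_config(self):
        return {"name": self.name}

    @classmethod
    def from_config(cls, config):
        # Default behaviour being illustrated: rebuild by calling the constructor.
        return cls(**config)


class PositionalEmbedding(Layer):
    def __init__(self, sequence_length, output_dim, **kwargs):
        super().__init__(**kwargs)
        # Placeholder for the sub-layer; the real class creates layers.Embedding here.
        self.position_embeddings = object()
        self.sequence_length = sequence_length
        self.output_dim = output_dim

    def get_config(self):
        # Only constructor arguments go into the config.
        config = super().get_config()
        config.update(
            {"sequence_length": self.sequence_length, "output_dim": self.output_dim}
        )
        return config


class BrokenPositionalEmbedding(PositionalEmbedding):
    def get_config(self):
        # Mirrors the get_config from the question: the sub-layer object is included.
        config = super().get_config()
        config["position_embeddings"] = self.position_embeddings
        return config


def try_round_trip(cls):
    """Build a layer, then rebuild it from its own config."""
    layer = cls(sequence_length=20, output_dim=8, name="frame_position_embedding")
    try:
        return "ok", cls.from_config(layer.get_config())
    except TypeError as err:
        return "TypeError", err


if __name__ == "__main__":
    print(try_round_trip(PositionalEmbedding)[0])
    print(try_round_trip(BrokenPositionalEmbedding)[1])
```

If the same idea carries over to the real classes, get_config() for PositionalEmbedding would return only sequence_length and output_dim, and for TransformerEncoder only embed_dim, dense_dim and num_heads, and load_model() would still need the custom_objects argument with both classes defined or imported beforehand.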