Is there a way to use the native tf Attention layer with keras Sequential API?

I want to use this particular class. I have found custom implementations such as this one, but what I really want is to use this specific class with the Sequential API.

Here is a code sample of what I am looking for:

model = tf.keras.models.Sequential()

model.add(tf.keras.layers.Embedding(vocab_length,
                          EMBEDDING_DIM, input_length=MAX_SEQUENCE_LENGTH,
                          weights=[embedding_matrix], trainable=False))

model.add(tf.keras.layers.Dropout(0.3))

model.add(tf.keras.layers.Conv1D(64, 5, activation='relu'))
model.add(tf.keras.layers.MaxPooling1D(pool_size=4))

model.add(tf.keras.layers.CuDNNLSTM(100))
model.add(tf.keras.layers.Dropout(0.4))

model.add(tf.keras.layers.Attention()) # Doesn't work this way

model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
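The call above fails because `tf.keras.layers.Attention` expects a *list* of tensors `[query, value]`, while `Sequential` hands each layer a single tensor. One way around this is the Functional API, where the sequence output can be passed to the layer twice as self-attention. A minimal sketch, assuming hypothetical sizes for `vocab_length`, `EMBEDDING_DIM`, and `MAX_SEQUENCE_LENGTH` (and a plain `LSTM` in place of `CuDNNLSTM`):

```python
import tensorflow as tf

# Hypothetical dimensions for illustration only
MAX_SEQUENCE_LENGTH, vocab_length, EMBEDDING_DIM = 100, 5000, 64

inputs = tf.keras.Input(shape=(MAX_SEQUENCE_LENGTH,))
x = tf.keras.layers.Embedding(vocab_length, EMBEDDING_DIM)(inputs)
x = tf.keras.layers.Conv1D(64, 5, activation='relu')(x)
x = tf.keras.layers.MaxPooling1D(pool_size=4)(x)
seq = tf.keras.layers.LSTM(100, return_sequences=True)(x)  # keep the time axis

# Attention takes [query, value]; passing the sequence twice gives self-attention
att = tf.keras.layers.Attention()([seq, seq])

x = tf.keras.layers.GlobalAveragePooling1D()(att)  # collapse time steps
outputs = tf.keras.layers.Dense(1, activation='sigmoid')(x)

model = tf.keras.Model(inputs, outputs)
model.compile(loss='binary_crossentropy', optimizer='adam')
```

The attended sequence still has a time dimension, so some pooling (here `GlobalAveragePooling1D`) is needed before the final `Dense` layer.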

I ended up using a custom class by tsterbak, found in this repository. It is the `AttentionWeightedAverage` class, and it is compatible with the Sequential API. Here is my model for reference:

model = Sequential()

model.add(Embedding(input_dim=vocab_length,
                    output_dim=EMBEDDING_DIM, input_length=MAX_SEQUENCE_LENGTH,
                    weights=[embedding_matrix], trainable=False))
model.add(Conv1D(64, 5, activation='relu'))
model.add(MaxPooling1D(pool_size=4))

model.add(Bidirectional(GRU(100, return_sequences=True)))

model.add(AttentionWeightedAverage())
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy', optimizer="adam", metrics=['accuracy'])

Note that this is the so-called "soft attention" or "attention with weighted average", as described in "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention". The details are more understandable here.
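For intuition, the weighted-average idea can be sketched as a small custom layer (my own simplified sketch, not the code from the repository): it scores each time step, softmaxes the scores into attention weights, and returns the weighted sum. Because it maps a single `(batch, time, features)` tensor to `(batch, features)`, it drops straight into a `Sequential` model.

```python
import tensorflow as tf

class SoftAttentionPooling(tf.keras.layers.Layer):
    """Simplified sketch of soft attention / attention with weighted average."""

    def build(self, input_shape):
        # One learned scoring vector over the feature dimension
        self.w = self.add_weight(name='w',
                                 shape=(input_shape[-1], 1),
                                 initializer='glorot_uniform')

    def call(self, x):
        # x: (batch, time, features)
        scores = tf.squeeze(tf.matmul(x, self.w), axis=-1)   # (batch, time)
        weights = tf.nn.softmax(scores, axis=-1)             # attention weights
        # Weighted average over the time axis -> (batch, features)
        return tf.reduce_sum(x * tf.expand_dims(weights, -1), axis=1)
```

Since the softmax weights sum to 1 over the time axis, the output is a convex combination of the time-step vectors, which is exactly the "weighted average" in the name.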