Implement attention in a vanilla encoder-decoder architecture

I tried the following vanilla encoder-decoder architecture (English-to-French NMT).
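
Roughly, the architecture looks like the minimal sketch below (a sketch only, not the exact code; the vocabulary sizes, dimensions, and variable names are placeholders I'm assuming here):

```python
# Minimal sketch of a vanilla Keras encoder-decoder for English-to-French NMT.
# All sizes and names below are illustrative assumptions, not the original values.
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense
from tensorflow.keras.models import Model

en_vocab, fr_vocab = 10000, 12000   # assumed source/target vocabulary sizes
emb_dim, hidden_dim = 256, 512      # assumed embedding / LSTM sizes

# Encoder: embed the English tokens and keep only the final LSTM states.
enc_inputs = Input(shape=(None,), name="encoder_inputs")
enc_emb = Embedding(en_vocab, emb_dim, name="encoder_embedding")(enc_inputs)
_, state_h, state_c = LSTM(hidden_dim, return_state=True, name="encoder_lstm")(enc_emb)

# Decoder: initialised with the encoder's final states (the classic seq2seq bottleneck).
dec_inputs = Input(shape=(None,), name="decoder_inputs")
dec_emb = Embedding(fr_vocab, emb_dim, name="decoder_embedding")(dec_inputs)
dec_outputs, _, _ = LSTM(hidden_dim, return_sequences=True, return_state=True,
                         name="decoder_lstm")(dec_emb, initial_state=[state_h, state_c])
outputs = Dense(fr_vocab, activation="softmax", name="decoder_dense")(dec_outputs)

model = Model([enc_inputs, dec_inputs], outputs)
model.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy")
model.summary()
```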

I want to know how to integrate a Keras attention layer here. Either the layer from the Keras docs or any attention module from a third-party repo is welcome. I just need to integrate it, see how it works, and fine-tune it.

Full code is available here.

The full code is not shown in this post because it is big and complex.

Finally solved this problem. I am using the third-party attention layer by Thushan Ganegedara, specifically its AttentionLayer class, and integrated it into my architecture as shown below.
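
Roughly, the integration looks like the sketch below. The AttentionLayer class is the one from Thushan Ganegedara's attention_keras repo; the import path, vocabulary sizes, and dimensions here are placeholders I'm assuming, so check the full code link above for the exact wiring:

```python
# Sketch of plugging Thushan Ganegedara's AttentionLayer into the encoder-decoder.
# The import assumes a local copy of the repo's layers/attention.py; all sizes are placeholders.
from tensorflow.keras.layers import (Input, Embedding, LSTM, Dense,
                                     Concatenate, TimeDistributed)
from tensorflow.keras.models import Model
from attention import AttentionLayer  # assumed import path for the third-party layer

en_vocab, fr_vocab = 10000, 12000   # assumed source/target vocabulary sizes
emb_dim, hidden_dim = 256, 512      # assumed embedding / LSTM sizes

# Encoder: now returns the full sequence of hidden states so attention has
# something to attend over, plus the final states to initialise the decoder.
enc_inputs = Input(shape=(None,), name="encoder_inputs")
enc_emb = Embedding(en_vocab, emb_dim, name="encoder_embedding")(enc_inputs)
enc_outputs, state_h, state_c = LSTM(hidden_dim, return_sequences=True,
                                     return_state=True, name="encoder_lstm")(enc_emb)

# Decoder: same as before, initialised with the encoder's final states.
dec_inputs = Input(shape=(None,), name="decoder_inputs")
dec_emb = Embedding(fr_vocab, emb_dim, name="decoder_embedding")(dec_inputs)
dec_outputs, _, _ = LSTM(hidden_dim, return_sequences=True, return_state=True,
                         name="decoder_lstm")(dec_emb, initial_state=[state_h, state_c])

# Bahdanau-style attention over the encoder states for every decoder time step;
# the layer returns the context vectors and the attention weights.
attn_out, attn_weights = AttentionLayer(name="attention_layer")([enc_outputs, dec_outputs])

# Concatenate the context vectors with the decoder outputs before the softmax.
dec_concat = Concatenate(axis=-1, name="concat_layer")([dec_outputs, attn_out])
outputs = TimeDistributed(Dense(fr_vocab, activation="softmax"))(dec_concat)

model = Model([enc_inputs, dec_inputs], outputs)
model.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy")
model.summary()
```

The attention weights (`attn_weights`) can also be pulled out at inference time to visualise which source tokens each target token attends to.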