Implement attention in a vanilla encoder-decoder architecture

I tried the following vanilla encoder-decoder architecture (English-to-French NMT).
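
Roughly, the architecture looks like the minimal sketch below (a sketch only, not the exact code; the vocabulary sizes, dimensions, and variable names are placeholders I'm assuming here):

```python
# Minimal sketch of a vanilla Keras encoder-decoder for English-to-French NMT.
# All sizes and names below are illustrative assumptions, not the original values.
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense
from tensorflow.keras.models import Model

en_vocab, fr_vocab = 10000, 12000   # assumed source/target vocabulary sizes
emb_dim, hidden_dim = 256, 512      # assumed embedding / LSTM sizes

# Encoder: embed the English tokens and keep only the final LSTM states.
enc_inputs = Input(shape=(None,), name="encoder_inputs")
enc_emb = Embedding(en_vocab, emb_dim, name="encoder_embedding")(enc_inputs)
_, state_h, state_c = LSTM(hidden_dim, return_state=True, name="encoder_lstm")(enc_emb)

# Decoder: initialised with the encoder's final states (the classic seq2seq bottleneck).
dec_inputs = Input(shape=(None,), name="decoder_inputs")
dec_emb = Embedding(fr_vocab, emb_dim, name="decoder_embedding")(dec_inputs)
dec_outputs, _, _ = LSTM(hidden_dim, return_sequences=True, return_state=True,
                         name="decoder_lstm")(dec_emb, initial_state=[state_h, state_c])
outputs = Dense(fr_vocab, activation="softmax", name="decoder_dense")(dec_outputs)

model = Model([enc_inputs, dec_inputs], outputs)
model.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy")
model.summary()
```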

I want to know how to integrate a Keras attention layer here. Either the layer from the Keras docs or any attention module from a third-party repo is welcome. I just need to integrate it, see how it works, and fine-tune it.

Full code is available here.

The full code is not shown in this post because it is big and complex.

Finally solved this problem. I am using the third-party attention layer by Thushan Ganegedara, specifically its AttentionLayer class, and integrated it into my architecture as shown below.
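
Roughly, the integration looks like the sketch below. The AttentionLayer class is the one from Thushan Ganegedara's attention_keras repo; the import path, vocabulary sizes, and dimensions here are placeholders I'm assuming, so check the full code link above for the exact wiring:

```python
# Sketch of plugging Thushan Ganegedara's AttentionLayer into the encoder-decoder.
# The import assumes a local copy of the repo's layers/attention.py; all sizes are placeholders.
from tensorflow.keras.layers import (Input, Embedding, LSTM, Dense,
                                     Concatenate, TimeDistributed)
from tensorflow.keras.models import Model
from attention import AttentionLayer  # assumed import path for the third-party layer

en_vocab, fr_vocab = 10000, 12000   # assumed source/target vocabulary sizes
emb_dim, hidden_dim = 256, 512      # assumed embedding / LSTM sizes

# Encoder: now returns the full sequence of hidden states so attention has
# something to attend over, plus the final states to initialise the decoder.
enc_inputs = Input(shape=(None,), name="encoder_inputs")
enc_emb = Embedding(en_vocab, emb_dim, name="encoder_embedding")(enc_inputs)
enc_outputs, state_h, state_c = LSTM(hidden_dim, return_sequences=True,
                                     return_state=True, name="encoder_lstm")(enc_emb)

# Decoder: same as before, initialised with the encoder's final states.
dec_inputs = Input(shape=(None,), name="decoder_inputs")
dec_emb = Embedding(fr_vocab, emb_dim, name="decoder_embedding")(dec_inputs)
dec_outputs, _, _ = LSTM(hidden_dim, return_sequences=True, return_state=True,
                         name="decoder_lstm")(dec_emb, initial_state=[state_h, state_c])

# Bahdanau-style attention over the encoder states for every decoder time step;
# the layer returns the context vectors and the attention weights.
attn_out, attn_weights = AttentionLayer(name="attention_layer")([enc_outputs, dec_outputs])

# Concatenate the context vectors with the decoder outputs before the softmax.
dec_concat = Concatenate(axis=-1, name="concat_layer")([dec_outputs, attn_out])
outputs = TimeDistributed(Dense(fr_vocab, activation="softmax"))(dec_concat)

model = Model([enc_inputs, dec_inputs], outputs)
model.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy")
model.summary()
```

The attention weights (`attn_weights`) can also be pulled out at inference time to visualise which source tokens each target token attends to.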