Cosine similarity for attention decoder in NMT

Question

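I am implementing a neural machine translation model, and for the decoder part (with an attention mechanism) I would like to use cosine similarity to compute the attention scores. This is the function: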

$$\mathrm{score}(a, b) = \frac{\langle a, b \rangle}{\|a\| \, \|b\|}$$

In my case:

        a = htilde_t (N, H)

        b = h (S, N, H)

        the output should be (S, N)

I am confused about their dimensions, and I don't know how to solve this in PyTorch.

Answer 1

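import torch.nn as nn
cos = nn.CosineSimilarity(dim=2, eps=1e-6)  # compare along the hidden dimension H
output = cos(a.unsqueeze(0), b)  # (1, N, H) against (S, N, H) gives (S, N)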

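You need to unsqueeze `a` to add a singleton ("ghost") dimension so that both inputs have the same number of dimensions; the new leading dimension of size 1 is then broadcast against `S`. From the documentation: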

    Input1: (∗1,D,∗2) where D is at position dim

    Input2: (∗1,D,∗2) , same shape as the Input1

    Output: (∗1,∗2)
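To check the shapes, here is a minimal sketch with random tensors. The sizes `S`, `N`, `H` are made-up values chosen only for illustration, and the manual computation is just a sanity check against the formula in the question, not PyTorch's internal implementation (which handles `eps` in its own way). It assumes a PyTorch version where `nn.CosineSimilarity` broadcasts its inputs.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, for illustration only:
# S = source length, N = batch size, H = hidden size
S, N, H = 5, 3, 4

torch.manual_seed(0)
htilde_t = torch.randn(N, H)   # a: decoder state, shape (N, H)
h = torch.randn(S, N, H)       # b: encoder states, shape (S, N, H)

# Built-in version: add a leading singleton dim so a is (1, N, H),
# which broadcasts against (S, N, H); dim=2 is the hidden dimension.
cos = nn.CosineSimilarity(dim=2, eps=1e-6)
scores = cos(htilde_t.unsqueeze(0), h)

# Manual version of <a, b> / (||a|| ||b||) as a sanity check.
# The clamp only guards against division by zero; it does not
# matter for the non-degenerate random inputs used here.
dot = (htilde_t.unsqueeze(0) * h).sum(dim=2)               # (S, N)
norms = htilde_t.norm(dim=1).unsqueeze(0) * h.norm(dim=2)  # (S, N)
manual = dot / norms.clamp_min(1e-6)

print(scores.shape)
print(torch.allclose(scores, manual, atol=1e-5))
```

If these scores are meant to become attention weights, a softmax over the source dimension (`dim=0` here) would be the usual next step, though that goes beyond what the question asks.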
