What's the difference between a greedy decoder RNN and a beam decoder with k=1?

Question

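Given a state vector, we can decode a sequence recursively in a greedy manner by generating each output in turn, where each prediction is conditioned on the previous output. I recently read a paper that describes using beam search during decoding with a beam size of 1 (k=1). If we only keep the best output at each step, isn't this the same as greedy decoding, offering none of the benefits that beam search usually provides?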

Answer 1

Finally found the answer: beam search with a beam size of 1 is the same as greedy search.
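
To see why the two coincide, here is a minimal sketch on a toy model. Everything in it is an assumption made for illustration: the three-token vocabulary, the `TABLE` of next-token probabilities (which depend only on the previous token, unlike a real RNN), and the helper names `next_log_probs`, `greedy_decode`, and `beam_decode`. With k=1 the beam keeps only the single highest-scoring extension at each step, which is exactly the greedy choice; with k=2 a second hypothesis survives and can overtake it.

```python
import math

# Hypothetical toy model: the next-token distribution depends only on the last token.
VOCAB = ["a", "b", "<eos>"]
TABLE = {
    "<s>": [0.5, 0.4, 0.1],    # distribution at the start of the sequence
    "a":   [0.3, 0.3, 0.4],
    "b":   [0.05, 0.05, 0.9],
}

def next_log_probs(prefix):
    """Log-probabilities over VOCAB given the tokens decoded so far."""
    last = prefix[-1] if prefix else "<s>"
    return [math.log(p) for p in TABLE[last]]

def greedy_decode(max_len=5):
    """At every step, append the single most probable next token."""
    seq = []
    for _ in range(max_len):
        lp = next_log_probs(seq)
        best = max(range(len(VOCAB)), key=lambda i: lp[i])
        seq.append(VOCAB[best])
        if VOCAB[best] == "<eos>":
            break
    return seq

def beam_decode(k, max_len=5):
    """Keep the k highest-scoring partial sequences at every step."""
    beams = [(0.0, [])]  # (cumulative log-probability, tokens)
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            if seq and seq[-1] == "<eos>":
                candidates.append((score, seq))  # finished hypotheses carry over
                continue
            lp = next_log_probs(seq)
            for i, tok in enumerate(VOCAB):
                candidates.append((score + lp[i], seq + [tok]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:k]
        if all(seq[-1] == "<eos>" for _, seq in beams):
            break
    return beams[0][1]

print(greedy_decode())   # ['a', '<eos>']
print(beam_decode(1))    # ['a', '<eos>'], identical to greedy
print(beam_decode(2))    # ['b', '<eos>'], a higher-probability sequence
```

In this toy setup greedy decoding and k=1 both commit to "a" first (probability 0.5) and end with a sequence of probability 0.2, while k=2 keeps "b" alive and finds the sequence with probability 0.36. So k=1 gives none of the usual benefit of beam search, as the question suspected.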

From "Abstractive Sentence Summarization with Attentive Recurrent Neural Networks":

"k refers to the size of the beam for generation; k = 1 implies greedy generation."
