Is there a difference between the Keras layers Masking() and Embedding(mask_zero=True)?

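The documentation for the Embedding layer is here:

The documentation for the Masking layer is here:

I could not find the difference there. Should one of the two layers be preferred in certain cases?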

I think Masking() is more about masking time steps, whereas Embedding(mask_zero=True) is more of a data filter. For Masking, the documentation says:

If all values in the input tensor at that timestep are equal to mask_value, then the timestep will be masked (skipped) in all downstream layers

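Here mask_value can be any value you like. So, depending on your data, you can decide to skip time steps that have no input, or use some other condition you can think of.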
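To make the quoted rule concrete, here is a minimal NumPy sketch of it. It is not Keras' own code: the helper name timestep_mask and the padding value -1.0 are assumptions chosen for illustration.

```python
import numpy as np

def timestep_mask(x, mask_value=0.0):
    """Return True for timesteps to keep and False for timesteps to skip.

    x has shape (batch, timesteps, features). A timestep is skipped only
    when every feature at that timestep equals mask_value.
    """
    return np.any(x != mask_value, axis=-1)

# One sequence, three timesteps, two features; -1.0 is the assumed padding value.
x = np.array([[[0.5, -1.0],    # only partly padding -> kept
               [-1.0, -1.0],   # all features are padding -> skipped
               [0.0, 0.0]]])   # zeros count as real data here -> kept

mask = timestep_mask(x, mask_value=-1.0)
print(mask)  # [[ True False  True]]
```

Note that the second feature of the first time step equals the padding value, yet the time step is kept, because the rule requires all values at a time step to match.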

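With Embedding, you put a mask over the input that skips computation wherever the input value is 0. This way, within a single time step you can propagate all of the data, part of it, or none of it through the network. It is not a mask for time step #3 or anything like that; it is a mask on input item #i. Also, only a missing input (input = 0) can be masked.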

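So I can certainly think of cases where the two are exactly equivalent (for example, when a missing input is encoded as 0 and every input value at that time step is 0), but they operate at different resolutions.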
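One such equivalent case can be checked directly by asking each layer for the mask it would produce. This is a small sketch assuming TensorFlow's bundled Keras is installed; the token ids, vocabulary size and embedding width are made up for illustration.

```python
import numpy as np
from tensorflow import keras

# Integer token ids, shape (batch=1, timesteps=4); 0 is used as padding.
ids = np.array([[3, 7, 0, 0]])

# Embedding tests each input index against 0.
embedding = keras.layers.Embedding(input_dim=10, output_dim=2, mask_zero=True)
embedding_mask = np.array(embedding.compute_mask(ids))

# The same sequence as float features, shape (1, 4, 1): one feature per timestep.
features = ids[..., np.newaxis].astype("float32")

# Masking tests whether all features of a timestep equal mask_value.
masking = keras.layers.Masking(mask_value=0.0)
masking_mask = np.array(masking.compute_mask(features))

print(embedding_mask)  # mask per input index
print(masking_mask)    # mask per timestep
```

With one feature per time step and 0 as the padding value, both masks mark the first two positions as valid and the last two as padded. The two rules diverge once a time step carries several features, or once the padding value is something other than 0, which Embedding cannot express.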