Why does single LSTM output not appear in return_sequence output?

Question

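I am trying to understand how LSTM models work, in particular the return_sequences argument of the Keras LSTM layer. As a simple example, referring to the code below, I somewhat understand that LSTM(1) "outputs a single hidden state for the input sequence with 3 time steps" [1] and that LSTM(1, return_sequences=True) "returns a sequence of 3 values, one hidden state output for each input time step" [1].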

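However, my understanding is that the single output of LSTM(1) depends on every time step, as is the nature of recurrent networks, which would mean that the final output of LSTM(1, return_sequences=True) should be the same as the single output of LSTM(1)?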

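As an example, referring to the code below, I don't understand why the final output of Model 2's prediction is not equal to Model 1's output. I got even more confused when I ran the same code with only one value/time step, because it also produced two different results. I think my confusion comes from mixing up multiple cells with multiple time steps, but it still hasn't clicked for me. Any clarification on this would be appreciated!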

from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, LSTM
import numpy as np

# one feature per 3 time steps
data = np.array([0.1, 0.2, 0.3]).reshape((1,3,1))

# Input + a single LSTM (two versions)
input = Input(shape=(3,1))
lstm1 = LSTM(1)(input)
lstm2 = LSTM(1, return_sequences=True)(input)

model1 = Model(input, lstm1) # without return_sequences
model2 = Model(input, lstm2) # with return_sequences

print(f'Model 1 without return_sequences:\n {model1.predict(data)}\n\n')
print(f'Model 2 with return_sequences:\n {model2.predict(data)}\n\n')

Returns:

Model 1 without return_sequences:  
[[-0.13452706]]
    
Model 2 with return_sequences:  
[[[0.01917788]   [0.05195162]   [0.09362084]]]
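
To make the recurrence described in the question concrete, here is a minimal plain-Python sketch of a single-unit LSTM with one input feature. The function name `lstm_single_unit` and all weight values are hypothetical and chosen only for illustration; this is not the Keras implementation, although it follows the standard gate equations (input, forget, candidate, output). It suggests that `return_sequences` only changes which hidden states are returned, so the last element of the sequence matches the single output whenever the same weights are used, and that different weights give different outputs.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_single_unit(xs, w, u, b, return_sequences=False):
    """Run a 1-unit LSTM over a list of scalar inputs.

    w, u, b hold the input weights, recurrent weights and biases
    for the four gates in the order (input, forget, candidate, output).
    """
    h, c = 0.0, 0.0          # hidden state and cell state start at zero
    hidden_states = []
    for x in xs:
        i = sigmoid(w[0] * x + u[0] * h + b[0])    # input gate
        f = sigmoid(w[1] * x + u[1] * h + b[1])    # forget gate
        g = math.tanh(w[2] * x + u[2] * h + b[2])  # candidate cell value
        o = sigmoid(w[3] * x + u[3] * h + b[3])    # output gate
        c = f * c + i * g                          # update cell state
        h = o * math.tanh(c)                       # new hidden state
        hidden_states.append(h)
    return hidden_states if return_sequences else h

# Hypothetical weights, for illustration only.
w = [0.5, -0.3, 0.8, 0.1]
u = [0.2, 0.4, -0.6, 0.7]
b = [0.0, 1.0, 0.0, 0.0]
xs = [0.1, 0.2, 0.3]

last_only = lstm_single_unit(xs, w, u, b)                        # like LSTM(1)
sequence = lstm_single_unit(xs, w, u, b, return_sequences=True)  # like LSTM(1, return_sequences=True)

# A second "layer" with different weights, mimicking a separate random initialization.
w_other = [0.5, -0.3, -0.8, 0.1]
other_last = lstm_single_unit(xs, w_other, u, b)

print(sequence[-1] == last_only)   # same weights: last step equals the single output
print(other_last == last_only)     # different weights: outputs differ
```

One unit processed over three time steps still yields three hidden states; the number of units and the number of time steps are independent of each other.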

Answer 1

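I finally figured it out. My assumption that the final output of LSTM(1, return_sequences=True) should be the same as the single output of LSTM(1) was correct, but I got different results because of the random seed! The two LSTM layers are separate layers, so each one gets its own randomly initialized weights. After setting the random seed before each LSTM layer, the results are equal to each other, as shown below: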

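import tensorflow as tf

# Input + a single LSTM (two versions)
input = Input(shape=(3,1))
tf.random.set_seed(2)   # <----- same seed before creating each layer
lstm1 = LSTM(1)(input)
tf.random.set_seed(2)   # <----- so both layers start from identical weights
lstm2 = LSTM(1, return_sequences=True)(input)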

Double-checking:

model1.predict(data)[0] == model2.predict(data)[0][-1] # returns: True
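
As an alternative that does not depend on seeding, the weights of one layer can be copied into the other. The sketch below assumes the same setup as the question; the variable names (`lstm_last`, `lstm_seq`, `pred_last`, `pred_seq`) are my own, and it uses `np.allclose` rather than `==` because exact floating-point equality can be fragile.

```python
import numpy as np
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, LSTM

# One feature per time step, 3 time steps.
data = np.array([0.1, 0.2, 0.3]).reshape((1, 3, 1))

inputs = Input(shape=(3, 1))
lstm_last = LSTM(1)                         # returns only the final hidden state
lstm_seq = LSTM(1, return_sequences=True)   # returns the hidden state at every step

# Calling the layers builds them, which creates their (random) weights.
out_last = lstm_last(inputs)
out_seq = lstm_seq(inputs)

# Overwrite the second layer's weights with the first layer's weights.
lstm_seq.set_weights(lstm_last.get_weights())

model1 = Model(inputs, out_last)
model2 = Model(inputs, out_seq)

pred_last = model1.predict(data, verbose=0)  # shape (1, 1)
pred_seq = model2.predict(data, verbose=0)   # shape (1, 3, 1)

# Compare the single output with the last time step of the sequence output.
same = np.allclose(pred_last[0], pred_seq[0, -1], atol=1e-6)
print(same)
```

Copying weights makes the dependency explicit: the two layers agree because they hold the same parameters, not because of anything `return_sequences` does.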

lstm

tensorflow