What's the difference between input_shape and batch_input_shape in LSTM
Is it just a different way of setting the same thing, or do they actually have different meanings? Does it have anything to do with network configuration?
Here's a simple example where I can't see any difference:
model = Sequential()
model.add(LSTM(1, batch_input_shape=(None,5,1), return_sequences=True))
model.add(LSTM(1, return_sequences=False))
and
model = Sequential()
model.add(LSTM(1, input_shape=(5,1), return_sequences=True))
model.add(LSTM(1, return_sequences=False))
However, when I set the batch size to 12 with batch_input_shape=(12,5,1) and then use batch_size=10 when fitting the model, I get an error:
ValueError: Cannot feed value of shape (10, 5, 1) for Tensor
'lstm_96_input:0', which has shape '(12, 5, 1)'
which obviously makes sense. However, I can see no point in restricting the batch size at the model level.
Am I missing something?
Is it just a different way of setting the same thing or do they actually have different meanings? Does it have anything to do with network configuration?
Yes, they are effectively equivalent, and your experiment confirms it.
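To see the equivalence, you can build both variants and feed them batches of different sizes; both leave the batch dimension free. A minimal sketch, assuming a Keras 2-style API (`tensorflow.keras`):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Model declared with batch_input_shape=(None, 5, 1): batch dim explicitly free
m1 = keras.Sequential([
    layers.LSTM(1, batch_input_shape=(None, 5, 1), return_sequences=True),
    layers.LSTM(1, return_sequences=False),
])

# Model declared with input_shape=(5, 1): batch dim implicitly free
m2 = keras.Sequential([
    layers.LSTM(1, input_shape=(5, 1), return_sequences=True),
    layers.LSTM(1, return_sequences=False),
])

# Both models accept arbitrary batch sizes (here 3 and 7)
out1 = m1.predict(np.zeros((3, 5, 1)), verbose=0)
out2 = m2.predict(np.zeros((7, 5, 1)), verbose=0)
print(out1.shape, out2.shape)  # (3, 1) (7, 1)
```

Only when the first entry of `batch_input_shape` is a concrete number (as in your `(12, 5, 1)` experiment) do the two forms diverge, because the batch dimension then becomes fixed.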
However I can see no point in restricting the batch size on model level.
A batch-size restriction is sometimes necessary. The example that comes to my mind is a stateful LSTM, where the last cell state of a batch is remembered and used to initialize the cell state of the following batch. This ensures that the client cannot feed a different batch size into the network. Sample code:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM

batch_size, timesteps, data_dim = 32, 8, 16  # placeholder values

# Expected input batch shape: (batch_size, timesteps, data_dim)
# Note that we have to provide the full batch_input_shape since the network is stateful.
# The sample of index i in batch k is the follow-up for sample i in batch k-1.
model = Sequential()
model.add(LSTM(32, return_sequences=True, stateful=True,
               batch_input_shape=(batch_size, timesteps, data_dim)))
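A hedged usage sketch of that stateful setup follows. The `Dense` head, the dimension values, and the random training data are illustrative additions, not part of the original snippet; the point is that every batch must have exactly `batch_size` samples and must not be shuffled:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Placeholder dimensions for illustration
batch_size, timesteps, data_dim = 4, 8, 2

model = keras.Sequential([
    layers.LSTM(32, return_sequences=True, stateful=True,
                batch_input_shape=(batch_size, timesteps, data_dim)),
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Three batches' worth of data; with stateful=True every batch must contain
# exactly batch_size samples, and shuffle=False keeps sample i of batch k
# aligned with sample i of batch k-1.
x = np.random.rand(batch_size * 3, timesteps, data_dim)
y = np.random.rand(batch_size * 3, timesteps, 1)
model.fit(x, y, batch_size=batch_size, shuffle=False, epochs=1, verbose=0)

# Inference must also use the fixed batch size
pred = model.predict(x[:batch_size], verbose=0)
print(pred.shape)  # (4, 8, 1)

# Clear the carried-over cell states before starting a new sequence
for layer in model.layers:
    if hasattr(layer, "reset_states"):
        layer.reset_states()
```

Feeding any batch of a different size into this model fails at the input check, which is exactly the error you observed with `batch_input_shape=(12,5,1)` and `batch_size=10`.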