What's the difference between input_shape and batch_input_shape in LSTM

Is this just a different way of setting the same thing, or do the two actually have different meanings? Does it have anything to do with the network configuration?

Take a simple example; I can't see any difference:

from keras.models import Sequential
from keras.layers import LSTM

model = Sequential()
model.add(LSTM(1, batch_input_shape=(None,5,1), return_sequences=True))
model.add(LSTM(1, return_sequences=False))

model = Sequential()
model.add(LSTM(1, input_shape=(5,1), return_sequences=True))
model.add(LSTM(1, return_sequences=False))

However, when I set the batch size to 12 with batch_input_shape=(12,5,1) and then use batch_size=10 when fitting the model, I get an error:

ValueError: Cannot feed value of shape (10, 5, 1) for Tensor 'lstm_96_input:0', which has shape '(12, 5, 1)'

That obviously makes sense. However, I see no point in restricting the batch size at the model level.

Am I missing something?

Is it just a different way of setting the same thing or do they actually have different meanings? Does it have anything to do with network configuration?

Yes, they are effectively equivalent, as your experiment confirms.
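
As a quick sanity check (a minimal sketch, reusing the toy layer sizes from the question), both forms report the same unconstrained batch dimension via model.input_shape:

from keras.models import Sequential
from keras.layers import LSTM

# Both forms leave the batch dimension unconstrained (None)
m1 = Sequential()
m1.add(LSTM(1, batch_input_shape=(None, 5, 1), return_sequences=True))

m2 = Sequential()
m2.add(LSTM(1, input_shape=(5, 1), return_sequences=True))

print(m1.input_shape)  # (None, 5, 1)
print(m2.input_shape)  # (None, 5, 1)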

However I can see no point in restricting the batch size on model level.

Restricting the batch size is sometimes necessary. The example that comes to mind is a stateful LSTM, where the last cell state of one batch is remembered and used to initialize the cell state for the following batch. This guarantees that the client cannot feed a different batch size into the network. Example code:

from keras.models import Sequential
from keras.layers import LSTM

# Example dimensions (placeholder values)
batch_size, timesteps, data_dim = 32, 8, 16

# Expected input batch shape: (batch_size, timesteps, data_dim).
# Note that we have to provide the full batch_input_shape since the network is stateful:
# the sample at index i in batch k is the follow-up to sample i in batch k-1.
model = Sequential()
model.add(LSTM(32, return_sequences=True, stateful=True,
               batch_input_shape=(batch_size, timesteps, data_dim)))
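
Training such a model then has to respect the fixed batch size. A minimal sketch of the fitting loop, with made-up data shapes and a mean-squared-error objective (these are assumptions for illustration, not from the original answer): every batch passed to fit must contain exactly batch_size samples, shuffle=False preserves the batch-to-batch sample alignment, and reset_states() clears the carried-over cell state between epochs.

import numpy as np

# Illustrative data: 10 full batches (shapes are assumptions for this sketch)
x = np.random.random((10 * batch_size, timesteps, data_dim))
y = np.random.random((10 * batch_size, timesteps, 32))  # matches return_sequences=True output

model.compile(loss='mse', optimizer='adam')

for epoch in range(5):
    # shuffle=False: sample i of batch k must follow sample i of batch k-1
    model.fit(x, y, batch_size=batch_size, epochs=1, shuffle=False)
    model.reset_states()  # forget the carried-over state between epochs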