What's the difference between input_shape and batch_input_shape in LSTM
Is it just a different way of setting the same thing, or do they actually have different meanings? Does it have anything to do with network configuration?
Here's a simple example where I can't see any difference:
model = Sequential()
model.add(LSTM(1, batch_input_shape=(None,5,1), return_sequences=True))
model.add(LSTM(1, return_sequences=False))
and
model = Sequential()
model.add(LSTM(1, input_shape=(5,1), return_sequences=True))
model.add(LSTM(1, return_sequences=False))
However, when I set the batch size to 12 with batch_input_shape=(12,5,1) and then use batch_size=10 when fitting the model, I get an error:
ValueError: Cannot feed value of shape (10, 5, 1) for Tensor
'lstm_96_input:0', which has shape '(12, 5, 1)'
which obviously makes sense. However, I can see no point in restricting the batch size at the model level.
Am I missing something?
Is it just a different way of setting the same thing or do they actually have different meanings? Does it have anything to do with network configuration?
Yes, they are effectively equivalent, and your experiment confirms it.
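To see the equivalence, you can build both variants and feed them batches of different sizes; both leave the batch dimension free. A minimal sketch, assuming a Keras 2-style API (`tensorflow.keras`):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Model declared with batch_input_shape=(None, 5, 1): batch dim explicitly free
m1 = keras.Sequential([
    layers.LSTM(1, batch_input_shape=(None, 5, 1), return_sequences=True),
    layers.LSTM(1, return_sequences=False),
])

# Model declared with input_shape=(5, 1): batch dim implicitly free
m2 = keras.Sequential([
    layers.LSTM(1, input_shape=(5, 1), return_sequences=True),
    layers.LSTM(1, return_sequences=False),
])

# Both models accept arbitrary batch sizes (here 3 and 7)
out1 = m1.predict(np.zeros((3, 5, 1)), verbose=0)
out2 = m2.predict(np.zeros((7, 5, 1)), verbose=0)
print(out1.shape, out2.shape)  # (3, 1) (7, 1)
```

Only when the first entry of `batch_input_shape` is a concrete number (as in your `(12, 5, 1)` experiment) do the two forms diverge, because the batch dimension then becomes fixed.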
However I can see no point in restricting the batch size on model level.
A batch-size restriction is sometimes necessary. The example that comes to my mind is a stateful LSTM, where the last cell state of a batch is remembered and used to initialize the cell state of the following batch. This ensures that the client cannot feed a different batch size into the network. Sample code:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM

batch_size, timesteps, data_dim = 32, 8, 16  # placeholder values

# Expected input batch shape: (batch_size, timesteps, data_dim)
# Note that we have to provide the full batch_input_shape since the network is stateful.
# The sample of index i in batch k is the follow-up for sample i in batch k-1.
model = Sequential()
model.add(LSTM(32, return_sequences=True, stateful=True,
               batch_input_shape=(batch_size, timesteps, data_dim)))
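A hedged usage sketch of that stateful setup follows. The `Dense` head, the dimension values, and the random training data are illustrative additions, not part of the original snippet; the point is that every batch must have exactly `batch_size` samples and must not be shuffled:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Placeholder dimensions for illustration
batch_size, timesteps, data_dim = 4, 8, 2

model = keras.Sequential([
    layers.LSTM(32, return_sequences=True, stateful=True,
                batch_input_shape=(batch_size, timesteps, data_dim)),
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Three batches' worth of data; with stateful=True every batch must contain
# exactly batch_size samples, and shuffle=False keeps sample i of batch k
# aligned with sample i of batch k-1.
x = np.random.rand(batch_size * 3, timesteps, data_dim)
y = np.random.rand(batch_size * 3, timesteps, 1)
model.fit(x, y, batch_size=batch_size, shuffle=False, epochs=1, verbose=0)

# Inference must also use the fixed batch size
pred = model.predict(x[:batch_size], verbose=0)
print(pred.shape)  # (4, 8, 1)

# Clear the carried-over cell states before starting a new sequence
for layer in model.layers:
    if hasattr(layer, "reset_states"):
        layer.reset_states()
```

Feeding any batch of a different size into this model fails at the input check, which is exactly the error you observed with `batch_input_shape=(12,5,1)` and `batch_size=10`.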