Incompatible layer with TensorFlow and LSTM
I am trying to create a network to forecast time series of arbitrary length (i.e. time_steps = None). I am testing different topologies, but I want a 7-neuron input layer (the input time series has 7 dimensions) and a 1-neuron output layer (the forecast value is one-dimensional); in between I am testing a variable number of LSTM layers, each with a variable number of neurons. I want to use CuDNN (just for speed), so there are some restrictions on the parameters I can use. Sometimes I get this strange error:
ValueError: Input 0 of layer lstm_1 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (None, 5)
The minimal code to reproduce the problem is:
import tensorflow as tf

rnn = tf.keras.models.Sequential()
rnn.add(tf.keras.layers.Input(shape=(1, 7)))
rnn.add(tf.keras.layers.LSTM(5, activation="tanh", return_sequences=False, unroll=False,
                             recurrent_activation='sigmoid', use_bias=True, time_major=True,
                             recurrent_dropout=0, stateful=False, input_shape=(None, 7)))
rnn.add(tf.keras.layers.LSTM(5, activation="tanh", return_sequences=False, unroll=False,
                             recurrent_activation='sigmoid', use_bias=True, time_major=True,
                             recurrent_dropout=0, stateful=False, input_shape=(None, 5)))
rnn.add(tf.keras.layers.Dense(1, activation="linear"))
Why does this happen? The exact same message is shown if I change the input_shape argument to (1, None, 5).

Change return_sequences to True in the first LSTM layer.
I may be wrong, but I believe that although input_shape is 3D, the batch_size is already inferred from the data. That means what the other comments suggested would actually amount to outputting 4D data.
Below is a working example:
import tensorflow as tf
import numpy as np

rnn = tf.keras.models.Sequential()
rnn.add(tf.keras.layers.Input(shape=(1, 7)))
rnn.add(tf.keras.layers.LSTM(5, activation="tanh", return_sequences=True, unroll=False,
                             recurrent_activation='sigmoid', use_bias=True, time_major=True,
                             recurrent_dropout=0, stateful=False, input_shape=(None, 7)))
rnn.add(tf.keras.layers.LSTM(5, activation="tanh", return_sequences=False, unroll=False,
                             recurrent_activation='sigmoid', use_bias=True, time_major=True,
                             recurrent_dropout=0, stateful=False, input_shape=(None, 5)))
rnn.add(tf.keras.layers.Dense(1, activation="linear"))

# random data with the (1, 7) shape
train = np.random.rand(10, 1, 7)
labels = np.random.randint(0, 2, 10)  # random binary labels (0 or 1)

rnn.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
rnn.fit(train, labels)
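The shape difference behind the error can be seen without TensorFlow at all. The toy NumPy forward pass below (a simplified re-implementation for illustration, not TensorFlow's CuDNN kernel; `lstm_forward` is a made-up name) shows that return_sequences=True yields a 3D output while return_sequences=False yields 2D, which is exactly why a second LSTM layer rejects the latter:

```python
import numpy as np

def lstm_forward(x, units, return_sequences=False, seed=0):
    """Minimal LSTM forward pass: x has shape (batch, timesteps, features)."""
    rng = np.random.default_rng(seed)
    batch, timesteps, features = x.shape
    # Weights for the input, forget, cell and output gates, stacked column-wise.
    Wx = rng.standard_normal((features, 4 * units)) * 0.1
    Wh = rng.standard_normal((units, 4 * units)) * 0.1
    b = np.zeros(4 * units)
    h = np.zeros((batch, units))  # hidden state
    c = np.zeros((batch, units))  # cell state
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    outputs = []
    for t in range(timesteps):
        z = x[:, t, :] @ Wx + h @ Wh + b
        i, f, g, o = np.split(z, 4, axis=1)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    if return_sequences:
        return np.stack(outputs, axis=1)  # 3D: (batch, timesteps, units)
    return h                              # 2D: (batch, units)

x = np.random.rand(10, 1, 7)
print(lstm_forward(x, 5, return_sequences=True).shape)   # (10, 1, 5) -> valid LSTM input
print(lstm_forward(x, 5, return_sequences=False).shape)  # (10, 5)    -> ndim=2, rejected
```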
When you stack LSTM layers, you must set each layer's return_sequences parameter to True. Only the last LSTM layer in the stack should have return_sequences=False.
import tensorflow as tf

rnn = tf.keras.models.Sequential()
rnn.add(tf.keras.layers.Input(shape=(1, 7)))
rnn.add(tf.keras.layers.LSTM(5, activation="tanh", return_sequences=True, unroll=False,  # changed line
                             recurrent_activation='sigmoid', use_bias=True, time_major=True,
                             recurrent_dropout=0, stateful=False, input_shape=(None, 7)))
rnn.add(tf.keras.layers.LSTM(5, activation="tanh", return_sequences=True, unroll=False,  # changed line
                             recurrent_activation='sigmoid', use_bias=True, time_major=True,
                             recurrent_dropout=0, stateful=False, input_shape=(None, 5)))
rnn.add(tf.keras.layers.LSTM(5, activation="tanh", return_sequences=False, unroll=False,  # keep return_sequences=False, as the next layer is the Dense layer
                             recurrent_activation='sigmoid', use_bias=True, time_major=True,
                             recurrent_dropout=0, stateful=False, input_shape=(None, 5)))
rnn.add(tf.keras.layers.Dense(1, activation="linear"))
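The rule can be captured in a tiny standalone helper. This is a hypothetical function (`check_lstm_stack` is not a Keras API), just a sketch of the constraint: every LSTM layer except the last must return sequences so the next one receives 3D input:

```python
def check_lstm_stack(return_sequences_flags):
    """Validate the return_sequences settings of a stack of LSTM layers.

    Every layer except the last must return sequences, so that the next
    layer receives the 3D (batch, timesteps, units) input it expects.
    """
    for idx, flag in enumerate(return_sequences_flags[:-1]):
        if not flag:
            raise ValueError(
                f"LSTM layer {idx} has return_sequences=False, so layer "
                f"{idx + 1} would receive 2D input (expected ndim=3, found ndim=2)")
    return True

print(check_lstm_stack([True, True, False]))  # True: valid stack
```

Calling it with the original failing configuration, check_lstm_stack([False, False]), raises a ValueError mirroring the Keras message.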