Incompatible layer with TensorFlow and LSTM

I am trying to create a network that predicts time series of arbitrary length (i.e. time_steps = None). I am testing different topologies, but I want a 7-neuron input layer (the input time series has 7 dimensions) and a 1-neuron output layer (the predicted value is one-dimensional); in between, I am testing a variable number of LSTM layers, each with a variable number of neurons. I want to use CuDNN (just to be faster), so there are some restrictions on the parameters I can use. Sometimes I get this strange error:

ValueError: Input 0 of layer lstm_1 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (None, 5)

The minimal code to reproduce the problem is:

import tensorflow as tf

rnn = tf.keras.models.Sequential()
rnn.add(tf.keras.layers.Input(shape=(1, 7)))
rnn.add(tf.keras.layers.LSTM(5, activation="tanh", return_sequences=False, unroll=False,  # return_sequences=False -> 2D output, which the next LSTM rejects
                             recurrent_activation='sigmoid', use_bias=True, time_major=True,
                             recurrent_dropout=0, stateful=False, input_shape=(None, 7)))
rnn.add(tf.keras.layers.LSTM(5, activation="tanh", return_sequences=False, unroll=False,
                             recurrent_activation='sigmoid', use_bias=True, time_major=True,
                             recurrent_dropout=0, stateful=False, input_shape=(None, 5)))
rnn.add(tf.keras.layers.Dense(1, activation="linear"))

Why does this problem occur? The exact same message is shown if I change the input_shape parameter to (1, None, 5).

Change return_sequences to True in the first LSTM layer.

I may be wrong, but I believe that although the input is 3D, the batch_size is already inferred from the data. This means that what the other comments suggest would actually amount to 4D data.
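
To see the rank mismatch directly, here is a minimal sketch (the tensor x and its shape are illustrative): with return_sequences=False an LSTM emits a 2D tensor (batch, units), while return_sequences=True keeps the 3D tensor (batch, timesteps, units) that a stacked LSTM expects.

import tensorflow as tf

x = tf.random.normal((10, 1, 7))  # (batch, timesteps, features)

# return_sequences=False returns only the last output: 2D (batch, units)
print(tf.keras.layers.LSTM(5, return_sequences=False)(x).shape)  # (10, 5)

# return_sequences=True returns the whole sequence: 3D (batch, timesteps, units)
print(tf.keras.layers.LSTM(5, return_sequences=True)(x).shape)   # (10, 1, 5)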

Here is a working example:

import tensorflow as tf
import numpy as np
rnn = tf.keras.models.Sequential()
rnn.add(tf.keras.layers.Input(shape=(1, 7)))
rnn.add(tf.keras.layers.LSTM(5, activation="tanh", return_sequences=True, unroll=False,
                             recurrent_activation='sigmoid', use_bias=True, time_major=True,
                             recurrent_dropout=0, stateful=False, input_shape=(None, 7)))
rnn.add(tf.keras.layers.LSTM(5, activation="tanh", return_sequences=False, unroll=False,
                             recurrent_activation='sigmoid', use_bias=True, time_major=True,
                             recurrent_dropout=0, stateful=False, input_shape=(None, 5)))
rnn.add(tf.keras.layers.Dense(1, activation="linear"))

# random data with the (1, 7) shape
train = np.random.rand(10, 1, 7)
labels = np.random.randint(0, 2, 10)  # randint's upper bound is exclusive, so use 2 for binary 0/1 labels


rnn.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

rnn.fit(train, labels)

When you stack LSTM layers, you have to set the parameter return_sequences=True on each of them. Only the last LSTM layer should have return_sequences=False.

import tensorflow as tf

rnn = tf.keras.models.Sequential()
rnn.add(tf.keras.layers.Input(shape=(1, 7)))
rnn.add(tf.keras.layers.LSTM(5, activation="tanh", return_sequences=True, unroll=False, # Changed Line
                             recurrent_activation='sigmoid', use_bias=True, time_major=True,
                             recurrent_dropout=0, stateful=False, input_shape=(None, 7)))
rnn.add(tf.keras.layers.LSTM(5, activation="tanh", return_sequences=True, unroll=False, # Changed Line
                             recurrent_activation='sigmoid', use_bias=True, time_major=True,
                             recurrent_dropout=0, stateful=False, input_shape=(None, 5)))
rnn.add(tf.keras.layers.LSTM(5, activation="tanh", return_sequences=False, unroll=False, # must keep return_sequences=False as the next layer is the Dense layer
                             recurrent_activation='sigmoid', use_bias=True, time_major=True,
                             recurrent_dropout=0, stateful=False, input_shape=(None, 5)))
rnn.add(tf.keras.layers.Dense(1, activation="linear"))
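
As a quick sanity check (a minimal sketch; the random input below is illustrative), the stacked model now builds and accepts a batch without raising the ndim error:

import numpy as np

rnn.summary()  # the first two LSTM layers now report 3D output shapes

# illustrative forward pass on random data; no ndim error is raised
out = rnn(np.random.rand(10, 1, 7).astype("float32"))
print(out.shape)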