Error when checking target: expected dense to have 3 dimensions, but got array with shape (32, 200)

I'm trying to modify the example at https://www.tensorflow.org/tutorials/sequences/text_generation to generate character-based text.

The code in the example uses TensorFlow eager execution (enabled via tensorflow.enable_eager_execution) and works fine, but if I disable eager execution I start getting this error:

Error when checking target: expected dense to have 3 dimensions, but got array with shape (32, 200)

Why does this happen? Shouldn't the code work the same way whether eager execution is enabled or not?

I tried flattening the output of the LSTM layer, but I get a similar error:

ValueError: Error when checking target: expected dense to have shape (1,) but got array with shape (200,)

The simplest code I could come up with to reproduce the problem is:

import tensorflow as tf
import numpy as np

# tf.enable_eager_execution()

def get_input():
    path_to_file = tf.keras.utils.get_file(
        'shakespeare.txt',
        'https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt'
    )
    with open(path_to_file) as f:
        text = f.read()
    return text


def get_dataset(text_as_indexes, sequence_size, sequences_per_batch):
    def split_input(sequence):
        return sequence[:-1], sequence[1:]

    data_set = tf.data.Dataset.from_tensor_slices(text_as_indexes)
    data_set = data_set.batch(sequence_size + 1, drop_remainder=True)
    data_set = data_set.map(split_input)
    data_set = data_set.shuffle(10000).batch(sequences_per_batch, drop_remainder=True)
    return data_set


if __name__ == '__main__':
    sequences_len = 200
    batch_size = 32
    embeddings_size = 64
    rnn_units = 128

    text = get_input()
    vocab = sorted(set(text))
    vocab_size = len(vocab)

    char2int = {c: i for i, c in enumerate(vocab)}
    int2char = np.array(vocab)
    text_as_int = np.array([char2int[c] for c in text])

    dataset = get_dataset(text_as_int, sequences_len, batch_size)
    steps_per_epoch = len(text_as_int) // sequences_len // batch_size

    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Embedding(
        input_dim=vocab_size,
        output_dim=embeddings_size,
        input_length=sequences_len))

    model.add(tf.keras.layers.LSTM(
        units=rnn_units,
        return_sequences=True))

    model.add(tf.keras.layers.Dense(units=vocab_size, activation='softmax'))

    model.compile(optimizer=tf.train.AdamOptimizer(),
                  loss='sparse_categorical_crossentropy')

    model.summary()
    model.fit(
        x=dataset.repeat(),
        batch_size=batch_size,
        steps_per_epoch=steps_per_epoch)

When using sparse_categorical_crossentropy, the labels should have shape (batch_size, sequence_length, 1) rather than simply (batch_size, sequence_length). You can fix this by reshaping the labels in the split_input() function as follows:

def split_input(sequence):
    return sequence[:-1], tf.reshape(sequence[1:], (-1,1))

The code above works in both eager and graph (non-eager) execution.
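The reshape only appends a trailing singleton axis to the labels; a minimal NumPy sketch of the shapes involved (using a made-up toy sequence, not the Shakespeare data):

```python
import numpy as np

# Toy stand-in for one window of sequence_size + 1 = 6 character indices.
sequence = np.array([5, 2, 7, 1, 4, 9])

# Original split_input(): inputs and labels both come out with shape (5,),
# which batches to (batch_size, sequence_length) - the shape that fails.
inputs, labels = sequence[:-1], sequence[1:]
print(inputs.shape, labels.shape)  # (5,) (5,)

# Fixed split_input(): reshape(-1, 1) gives the labels shape (5, 1),
# which batches to (batch_size, sequence_length, 1), matching what
# sparse_categorical_crossentropy expects in graph mode.
labels_fixed = sequence[1:].reshape(-1, 1)
print(labels_fixed.shape)  # (5, 1)
```

The values are untouched; only the rank changes, so each timestep's label becomes its own length-1 vector lining up with the model's (batch_size, sequence_length, vocab_size) output.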