ValueError: Unexpected result of `train_function` (Empty logs). for RNN
I am reproducing the examples from Chapter 16 of Aurélien Géron's Hands-On Machine Learning and ran into an error when trying to train a simple RNN model.
The error is:
ValueError: Unexpected result of `train_function` (Empty logs). Please use `Model.compile(..., run_eagerly=True)`, or `tf.config.run_functions_eagerly(True)` for more information of where went wrong, or file a issue/bug to `tf.keras`.
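The error message suggests enabling eager execution to get a clearer picture of where training fails; for reference, that can be done globally before calling `model.fit` (or per model via `Model.compile(..., run_eagerly=True)`):
import tensorflow as tf

# Debugging switch mentioned in the error message: run tf.functions eagerly
# (much slower, but produces full Python tracebacks for diagnosis)
tf.config.run_functions_eagerly(True)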
The code used to download and preprocess the data:
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models, preprocessing, utils

AUTOTUNE = tf.data.AUTOTUNE

shakespeare_url = 'https://homl.info/shakespeare'
filepath = utils.get_file('shakespeare.txt', shakespeare_url)
with open(filepath) as f:
shakespeare_text = f.read()
# Let's tokenize the text at the character level
tokenizer = preprocessing.text.Tokenizer(char_level=True)
tokenizer.fit_on_texts([shakespeare_text])
# Number of distinct characters
max_id = len(tokenizer.word_index)
# total number of characters
dataset_size = tokenizer.document_count
# Let's encode the full text and subtract 1 to have a range of 0-38 instead of 1-39
[encoded] = np.array(tokenizer.texts_to_sequences([shakespeare_text])) - 1
# Let's use the first 90% of the data to train the model
train_size = dataset_size * 90 // 100
dataset = tf.data.Dataset.from_tensor_slices(encoded[:train_size])
n_steps = 100
window_length = n_steps + 1 # 100 steps plus the target
dataset = dataset.window(window_length, shift=1, drop_remainder=True)
# Let's flatten the windowed dataset into tensors to pass to the model
dataset = dataset.flat_map(lambda window: window.batch(window_length))
# Let's shuffle the windows
batch_size = 32
dataset = dataset.shuffle(10000).batch(batch_size)
dataset = dataset.map(lambda windows: (windows[:, :-1], windows[:, 1:]), num_parallel_calls=AUTOTUNE)
# Encoding the categories as one-hot encoding since the categories are relatively few (39)
dataset = dataset.map(lambda x_batch, y_batch: (tf.one_hot(x_batch, depth=max_id), y_batch), num_parallel_calls=AUTOTUNE)
dataset = dataset.prefetch(AUTOTUNE)
The model code:
model = models.Sequential([
layers.GRU(128, return_sequences=True, input_shape=[None, max_id], dropout=0.2, recurrent_dropout=0.2),
layers.GRU(128, return_sequences=True, dropout=0.2, recurrent_dropout=0.2),
layers.TimeDistributed(layers.Dense(max_id, activation='softmax'))
])
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['sparse_categorical_crossentropy'])
history = model.fit(dataset, epochs=20)
Feel free to ask for more information if needed. Thanks in advance.
The problem is that `tokenizer.document_count` treats the whole text as a single entry, which is why `dataset_size` equals 1, `train_size` is therefore 0, and the resulting dataset is empty. Use the length of the `encoded` array instead to get the true number of characters.
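For instance, a quick check after fitting the tokenizer and encoding the text (the exact character count depends on the downloaded file, but `document_count` will be 1 because `fit_on_texts` was given a single string):
print(tokenizer.document_count)        # 1 -- the whole text counts as one "document"
print(encoded.shape[0])                # roughly 1.1 million characters
print(encoded.shape[0] * 90 // 100)    # a usable, non-zero train_size
With `dataset_size` taken from `encoded`, the corrected script becomes: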
import tensorflow as tf
import numpy as np
shakespeare_url = 'https://homl.info/shakespeare'
filepath = tf.keras.utils.get_file('shakespeare.txt', shakespeare_url)
with open(filepath) as f:
shakespeare_text = f.read()
# Let's tokenize the text at the character level
tokenizer = tf.keras.preprocessing.text.Tokenizer(char_level=True)
tokenizer.fit_on_texts([shakespeare_text])
# Number of distinct characters
max_id = len(tokenizer.word_index)
# Let's encode the full text and subtract 1 to have a range of 0-38 instead of 1-39
[encoded] = np.array(tokenizer.texts_to_sequences([shakespeare_text])) - 1
# Total number of characters
dataset_size = encoded.shape[0]
# Let's use the first 90% of the data to train the model
train_size = dataset_size * 90 // 100
dataset = tf.data.Dataset.from_tensor_slices(encoded[:train_size])
n_steps = 100
window_length = n_steps + 1 # 100 steps plus the target
dataset = dataset.window(window_length, shift=1, drop_remainder=True)
# Let's flatten the windowed dataset into tensors to pass to the model
dataset = dataset.flat_map(lambda window: window.batch(window_length))
# Let's shuffle the windows
batch_size = 32
dataset = dataset.shuffle(10000).batch(batch_size)
dataset = dataset.map(lambda windows: (windows[:, :-1], windows[:, 1:]), num_parallel_calls=tf.data.AUTOTUNE)
# Encoding the categories as one-hot encoding since the categories are relatively few (39)
dataset = dataset.map(lambda x_batch, y_batch: (tf.one_hot(x_batch, depth=max_id), y_batch), num_parallel_calls=tf.data.AUTOTUNE)
dataset = dataset.prefetch(tf.data.AUTOTUNE)
model = tf.keras.Sequential([
tf.keras.layers.GRU(128, return_sequences=True, input_shape=[None, max_id], dropout=0.2, recurrent_dropout=0.2),
tf.keras.layers.GRU(128, return_sequences=True, dropout=0.2, recurrent_dropout=0.2),
tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(max_id, activation='softmax'))
])
model.summary()
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['sparse_categorical_crossentropy'])
history = model.fit(dataset, epochs=20)
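As a follow-up, once training finishes you can sanity-check the model with the same kind of next-character prediction as in the book's example; a minimal sketch (the input string is just an illustrative placeholder):
def preprocess(texts):
    # Encode new text exactly like the training data: 0-based ids, then one-hot vectors
    X = np.array(tokenizer.texts_to_sequences(texts)) - 1
    return tf.one_hot(X, depth=max_id)

X_new = preprocess(['How are yo'])                 # hypothetical example input
y_pred = np.argmax(model.predict(X_new), axis=-1)  # most likely character id at each step
# Shift ids back to 1-based before decoding; the last character of the first sequence is the prediction
print(tokenizer.sequences_to_texts(y_pred + 1)[0][-1])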