"No gradients provided for any variable" 试图适应 Keras 顺序

"No gradients provided for any variable" when trying to fit Keras Sequential

I am trying to create and train a Sequential model like this:

def model(training: Dataset, validation: Dataset):
    model = Sequential(layers=[
        Embedding(input_dim=1001, output_dim=16),
        Dropout(0.2),
        GlobalAveragePooling1D(),
        Dropout(0.2),
        Dense(1),
    ])
    model.compile(loss=BinaryCrossentropy(from_logits=True), optimizer='adam',
                  metrics=BinaryAccuracy(threshold=0.0))
    model.fit(x=training, validation_data=validation, epochs=10)

When I run it, I get the following error on the model.fit line:

ValueError: No gradients provided for any variable: ['embedding/embeddings:0', 'dense/kernel:0', 'dense/bias:0'].

I have come across some answers about using an optimizer, but how would that apply to Sequential rather than Model? Is there something else I am missing?

Edit: the result of print(training):

<MapDataset shapes: ((None, 250), (None,)), types: (tf.int64, tf.int32)>

Edit: a script that reproduces the error using the IMDB sample data

from tensorflow.keras import Sequential
from tensorflow import data
from keras.layers import TextVectorization
import tensorflow as tf
from tensorflow.keras.layers import Embedding, Dropout, GlobalAveragePooling1D, Dense
from tensorflow.keras.metrics import BinaryAccuracy, BinaryCrossentropy
import os


def split_dataset(dataset: data.Dataset):
    record_count = len(list(dataset))
    training_count = int((70 / 100) * record_count)
    validation_count = int((15 / 100) * record_count)

    raw_train_ds = dataset.take(training_count)
    raw_val_ds = dataset.skip(training_count).take(validation_count)
    raw_test_ds = dataset.skip(training_count + validation_count)

    return {"train": raw_train_ds, "test": raw_test_ds, "validate": raw_val_ds}


def clean(text, label):
    return tf.strings.unicode_transcode(text, "US ASCII", "UTF-8")


def vectorize_dataset(dataset: data.Dataset):
    return dataset.map(vectorize_text)


def vectorize_text(text, label):
    text = tf.expand_dims(text, -1)
    return vectorize_layer(text), label


url = "https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz"
dataset_tar = tf.keras.utils.get_file("aclImdb_v1", url,
                                    untar=True, cache_dir='.',
                                    cache_subdir='')
dataset_dir = os.path.join(os.path.dirname(dataset_tar), 'aclImdb')

batch_size = 32
seed = 42
dataset = tf.keras.preprocessing.text_dataset_from_directory(
    'aclImdb/train',
    batch_size=batch_size,
    validation_split=0.2,
    subset='training',
    seed=seed)

split_data = split_dataset(dataset)
raw_train = split_data['train']
raw_val = split_data['validate']
raw_test = split_data['test']

vectorize_layer = TextVectorization(max_tokens=10000, output_mode="int", output_sequence_length=250, ngrams=1)
cleaned_text = raw_train.map(clean)
vectorize_layer.adapt(cleaned_text)

train = vectorize_dataset(raw_train)
test = vectorize_dataset(raw_test)
validate = vectorize_dataset(raw_val)


def model(training, validation):
    sequential_model = Sequential(
        layers=[Embedding(input_dim=1001, output_dim=16), Dropout(0.2), GlobalAveragePooling1D(), Dropout(0.2),
                Dense(1)])
    sequential_model.compile(loss=BinaryCrossentropy(from_logits=True), optimizer='adam', metrics=BinaryAccuracy(threshold=0.0))
    sequential_model.fit(x=training, validation_data=validation, epochs=10)


model(train, validate)

In your script, BinaryCrossentropy is imported from tensorflow.keras.metrics. The metrics class only tracks a value for reporting and is not usable as a training loss, so Keras cannot compute gradients from it.

The correct import is from tensorflow.keras.losses import BinaryCrossentropy.
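
A minimal sketch of the fix, reusing the names from your reproduction script:

from tensorflow.keras.losses import BinaryCrossentropy   # differentiable loss
from tensorflow.keras.metrics import BinaryAccuracy      # evaluation-only metric

sequential_model.compile(
    loss=BinaryCrossentropy(from_logits=True),  # now from losses, so gradients flow
    optimizer='adam',
    metrics=[BinaryAccuracy(threshold=0.0)],    # metrics stay in tf.keras.metrics
)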

The problem in your code is in the following line:

vectorize_layer = TextVectorization(max_tokens=10000, output_mode="int", output_sequence_length=250, ngrams=1)

max_tokens in the TextVectorization layer corresponds to the total number of unique words in the vocabulary, i.e. the maximum vocabulary size the layer will keep.
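
As a quick illustration (the corpus below is a hypothetical toy, not your data), you can adapt the layer and inspect how many tokens it actually kept:

import tensorflow as tf
from tensorflow.keras.layers import TextVectorization  # TF >= 2.6; older releases expose it under tf.keras.layers.experimental.preprocessing

layer = TextVectorization(max_tokens=10, output_mode="int")
layer.adapt(tf.constant(["great movie", "terrible movie", "great acting"]))
print(len(layer.get_vocabulary()))  # at most max_tokens, including '' (padding) and '[UNK]'
print(layer.get_vocabulary())       # e.g. ['', '[UNK]', 'great', 'movie', ...]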

The Embedding layer can be understood as a lookup table that maps integer indices (which stand for specific words) to dense vectors (their embeddings).

In your code the Embedding layer has dimensions (1001, 16), which means its lookup table only has rows for words mapped to integer indices in the range 0..1000; any (row, column) pair corresponding to an index greater than that has no entry in the table and is not handled. Hence the ValueError.
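
A small sketch of that mismatch (the indices here are hypothetical):

import tensorflow as tf
from tensorflow.keras.layers import Embedding

embedding = Embedding(input_dim=1001, output_dim=16)  # lookup table with rows 0..1000
print(embedding(tf.constant([[3, 1000]])).shape)      # (1, 2, 16): both indices in range
# TextVectorization(max_tokens=10000) can emit indices up to 9999, which have no
# row in this table; depending on the device the lookup fails or yields garbage:
# embedding(tf.constant([[9999]]))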

I changed it to TextVectorization(max_tokens=5000) and Embedding(5000, 16), and ran your code.

The updated code and the results I got are shown below:

from tensorflow import keras
from tensorflow.keras import layers


def model(training, validation):
    model = keras.Sequential(
        [
            layers.Embedding(input_dim=5000, output_dim=16),
            layers.Dropout(0.2),
            layers.GlobalAveragePooling1D(),
            layers.Dropout(0.2),
            layers.Dense(1),
        ]
    )
    model.compile(
        optimizer=keras.optimizers.Adam(),
        loss=keras.losses.BinaryCrossentropy(from_logits=True),
        metrics=[keras.metrics.BinaryAccuracy(threshold=0.0)],
    )
    model.fit(x=training, validation_data=validation, epochs=10)
    return model

Output:
Epoch 1/10
437/437 [==============================] - 10s 22ms/step - loss: 0.6797 - binary_accuracy: 0.6455 - val_loss: 0.6539 - val_binary_accuracy: 0.7554
Epoch 2/10
437/437 [==============================] - 10s 22ms/step - loss: 0.6109 - binary_accuracy: 0.7625 - val_loss: 0.5700 - val_binary_accuracy: 0.7880
Epoch 3/10
437/437 [==============================] - 9s 22ms/step - loss: 0.5263 - binary_accuracy: 0.8098 - val_loss: 0.4931 - val_binary_accuracy: 0.8233
Epoch 4/10
437/437 [==============================] - 10s 22ms/step - loss: 0.4580 - binary_accuracy: 0.8368 - val_loss: 0.4373 - val_binary_accuracy: 0.8448
Epoch 5/10
437/437 [==============================] - 10s 22ms/step - loss: 0.4072 - binary_accuracy: 0.8560 - val_loss: 0.4003 - val_binary_accuracy: 0.8522
Epoch 6/10
437/437 [==============================] - 10s 22ms/step - loss: 0.3717 - binary_accuracy: 0.8641 - val_loss: 0.3733 - val_binary_accuracy: 0.8589
Epoch 7/10
437/437 [==============================] - 10s 22ms/step - loss: 0.3451 - binary_accuracy: 0.8728 - val_loss: 0.3528 - val_binary_accuracy: 0.8582
Epoch 8/10
437/437 [==============================] - 9s 22ms/step - loss: 0.3220 - binary_accuracy: 0.8806 - val_loss: 0.3345 - val_binary_accuracy: 0.8673
Epoch 9/10
437/437 [==============================] - 9s 22ms/step - loss: 0.3048 - binary_accuracy: 0.8868 - val_loss: 0.3287 - val_binary_accuracy: 0.8673
Epoch 10/10
437/437 [==============================] - 10s 22ms/step - loss: 0.2891 - binary_accuracy: 0.8929 - val_loss: 0.3222 - val_binary_accuracy: 0.8679