TensorFlow gradient always gives None when using GradientTape

Question

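I have been experimenting with implementing my own loss function in TensorFlow, but I always get None gradients. To reproduce the problem, I have reduced my program to a minimal example. I define a very simple model: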

import tensorflow as tf

model = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(3,), name="input"),
        tf.keras.layers.Dense(64, activation="relu", name="layer2"),
        tf.keras.layers.Dense(3, activation="softmax", name="output"),
    ]
)

and then a very simple (though probably useless) loss function, together with a training function:

def dummy_loss(x):
  return tf.reduce_sum(x)

def train(model, inputs, learning_rate):
  outputs = model(inputs)
  with tf.GradientTape() as t:
    current_loss = dummy_loss(outputs)
  temp = t.gradient(current_loss, model.trainable_weights)
train(model, tf.random.normal((10, 3)), learning_rate=0.001)

But t.gradient(current_loss, model.trainable_weights) only gives me a list of None values, i.e. [None, None, None, None]. Why is this the case? What exactly am I doing wrong? Could I be misunderstanding how TensorFlow works?

Answer 1

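You need to run the computation graph or model (i.e. the forward pass) inside the GradientTape context so that all the operations in the model are recorded. The tape only records operations executed while its context is active. In your code, model(inputs) runs before the tape is opened, so the tape sees no connection between current_loss and the model's weights and returns None for each of them: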

  with tf.GradientTape() as t:
    outputs = model(inputs)  # This line should be within context manager
    current_loss = dummy_loss(outputs)
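
For completeness, here is a minimal sketch of the question's example with the forward pass moved inside the tape. The name train_step and the choice of tf.keras.optimizers.SGD are assumptions made for illustration (the question never shows how learning_rate is used); the model and dummy_loss are taken from the question as-is.

```python
import tensorflow as tf

model = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(3,), name="input"),
        tf.keras.layers.Dense(64, activation="relu", name="layer2"),
        tf.keras.layers.Dense(3, activation="softmax", name="output"),
    ]
)


def dummy_loss(x):
    return tf.reduce_sum(x)


def train_step(model, inputs, optimizer):
    # Hypothetical helper: forward pass and loss both run inside the tape,
    # so every operation from the weights to the loss is recorded.
    with tf.GradientTape() as t:
        outputs = model(inputs)
        current_loss = dummy_loss(outputs)
    # Gradients are computed after leaving the context.
    grads = t.gradient(current_loss, model.trainable_weights)
    # Assumed update rule: plain SGD on the trainable weights.
    optimizer.apply_gradients(zip(grads, model.trainable_weights))
    return current_loss, grads


optimizer = tf.keras.optimizers.SGD(learning_rate=0.001)
loss, grads = train_step(model, tf.random.normal((10, 3)), optimizer)
print([g is not None for g in grads])
```

Note that with this particular loss the gradients are real tensors but numerically close to zero: each softmax output row sums to 1, so tf.reduce_sum(outputs) is essentially constant (about 10 for a batch of 10) regardless of the weights. That is a property of the dummy loss, not of GradientTape.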
