Gradients are None when using tf.function decorator
I am trying to migrate my code to TensorFlow 2.0, but I cannot build an explicit graph using tf.function. In particular, given the following model:
def new_dueling_model(name, input_size, output_size):
    states = tf.keras.Input(shape=(input_size,))
    h1 = tf.keras.layers.Dense(256, activation='relu')(states)
    # State value function
    value_h2 = tf.keras.layers.Dense(128, activation='relu')(h1)
    value_output = tf.keras.layers.Dense(1)(value_h2)
    # Advantage function
    advantage_h2 = tf.keras.layers.Dense(128, activation='relu')(h1)
    advantage_output = tf.keras.layers.Dense(output_size)(advantage_h2)
    outputs = value_output + (advantage_output - tf.reduce_mean(advantage_output, axis=1, keepdims=True))
    model = tf.keras.Model(inputs=states, outputs=outputs, name=name)
    return model
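The dueling aggregation in the model's last layer, Q = V + (A - mean(A)), can be checked on toy numbers. This is an illustrative sketch, not part of the original code:

```python
import tensorflow as tf

# Toy check of the dueling aggregation Q = V + (A - mean(A)).
# Shapes mirror a batch of one state with three actions.
value = tf.constant([[2.0]])                # V(s), shape (1, 1)
advantage = tf.constant([[1.0, 3.0, 2.0]])  # A(s, a), shape (1, 3)
q = value + (advantage - tf.reduce_mean(advantage, axis=1, keepdims=True))
# mean(A) = 2.0, so q == [[1.0, 3.0, 2.0]]
```

Subtracting the mean advantage makes the decomposition identifiable: the advantages are forced to average to zero, so the value stream carries the state value.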
and training it with the following function:
def q_train(states, actions, targets, is_weights, model, output_size, learning_rate, clip_grad):
    optimizer = tf.keras.optimizers.RMSprop(learning_rate=learning_rate)
    with tf.GradientTape() as tape:
        outputs = model(states)
        q_values = tf.multiply(outputs, tf.one_hot(tf.squeeze(actions), output_size))
        loss_value = tf.reduce_mean(is_weights * tf.losses.mean_squared_error(targets, q_values))
    grads = tape.gradient(loss_value, model.trainable_variables)
    selected_q_values = tf.reduce_sum(q_values, axis=1)
    selected_targets = tf.reduce_sum(targets, axis=1)
    td_errors = tf.clip_by_value(selected_q_values - selected_targets, -1.0, 1.0)
    if clip_grad:
        optimizer.apply_gradients(zip([tf.clip_by_value(grad, -1.0, 1.0) for grad in grads], model.trainable_variables))
    else:
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return td_errors
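The one-hot masking used above to pick out the Q-value of each chosen action can be sketched in isolation (toy tensors, not the author's data):

```python
import tensorflow as tf

# Select per-action Q-values with a one-hot mask, as q_train does.
outputs = tf.constant([[1.0, 2.0, 3.0],
                       [4.0, 5.0, 6.0]])    # Q-values for 3 actions
actions = tf.constant([[2], [0]])           # chosen action per sample
mask = tf.one_hot(tf.squeeze(actions), 3)   # shape (2, 3)
q_values = outputs * mask                   # zero out non-chosen actions
selected = tf.reduce_sum(q_values, axis=1)  # Q of chosen action per sample
```

Here tf.squeeze turns the (batch, 1) action column into a (batch,) vector, and the reduce_sum collapses each masked row to the single surviving Q-value.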
In my main loop I have the following call to train the model:
# states, actions, targets and is_weights are numpy arrays
# model is created using new_dueling_model
td_errors = q_train(states, actions, targets, is_weights, model, num_actions, 0.00025, False)
# ...
Everything works fine but, as expected, the training steps are much slower compared to the tf1.x code. So I decorated the q_train function to get a performant tf graph. But now, every time I call the function, the gradients are always None:
@tf.function
def q_train(...):
    # ...
    grads = tape.gradient(loss_value, model.trainable_variables)
    # grads here are None
What is going wrong?
I solved the problem.
First, I installed the nightly package with:
pip install tf-nightly-2.0-preview
At this point, running the code produced the following error:
ValueError: tf.function-decorated function tried to create variables on non-first call
I fixed this new error by creating the optimizer
optimizer = tf.keras.optimizers.RMSprop(learning_rate=learning_rate)
outside of the @tf.function-decorated function, and now everything works as expected.
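A minimal sketch of the resulting pattern, using a small stand-in model rather than the dueling network above: the model and optimizer are created once at module level, and only the training step is traced.

```python
import tensorflow as tf

# Stand-in model and optimizer, created OUTSIDE the tf.function so that
# tracing never tries to create variables on a non-first call.
model = tf.keras.Sequential([tf.keras.layers.Dense(4, activation='relu'),
                             tf.keras.layers.Dense(2)])
optimizer = tf.keras.optimizers.RMSprop(learning_rate=0.00025)

@tf.function
def train_step(states, targets):
    with tf.GradientTape() as tape:
        outputs = model(states)
        loss = tf.reduce_mean(tf.keras.losses.mean_squared_error(targets, outputs))
    grads = tape.gradient(loss, model.trainable_variables)  # no longer None
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

states = tf.random.normal((16, 8))
targets = tf.random.normal((16, 2))
loss1 = train_step(states, targets)  # first call traces the graph
loss2 = train_step(states, targets)  # later calls reuse the traced graph
```

The layer weights are still created on the first traced call, which tf.function allows; it is only objects created on every call, like an optimizer constructed inside the function body, that trigger the "tried to create variables on non-first call" error.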