Is it possible to acquire an intermediate gradient? (Tensorflow)
With tf.GradientTape you can compute the gradients after the forward pass like this:
with tf.GradientTape() as tape:
    out = model(x, training=True)
    out = tf.reshape(out, (num_img, 1, 10))  # Resizing
    loss = tf.keras.losses.categorical_crossentropy(y, out)
gradient = tape.gradient(loss, model.trainable_variables)
However, for a cifar10 input this returns the gradients of the input images.
Is there a way to access the gradients at an intermediate step, so that they have gone through "some" training?
EDIT: Thanks to your comment, I now have a better understanding of your problem.
The code below is far from ideal and does not take batched training etc. into account, but it should give you a good starting point.
I wrote a custom training step that basically replaces the model.fit method. There may be better ways to do this, but it should let you compare the gradients quickly.
def custom_training(model, data):
    x, y = data
    # Training step: compute gradients and apply them once
    with tf.GradientTape() as tape:
        y_pred = model(x, training=True)  # Forward pass
        # Compute the loss value (MSE is hard-coded here)
        loss = tf.keras.losses.mse(y, y_pred)
    trainable_vars = model.trainable_variables
    gradients = tape.gradient(loss, trainable_vars)
    tf.keras.optimizers.Adam().apply_gradients(zip(gradients, trainable_vars))
    # Compute the gradients again after the update, without applying them
    with tf.GradientTape() as tape:
        y_pred = model(x, training=False)  # Forward pass
        loss = tf.keras.losses.mse(y, y_pred)
    trainable_vars = model.trainable_variables
    gradients_plus = tape.gradient(loss, trainable_vars)
    return gradients, gradients_plus
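As a quick usage sketch (my addition, not part of the original answer): once a model and an (x, y) pair exist, for example the toy model defined just below, you could compare the gradients before and after the single update step like this:

grads_before, grads_after = custom_training(model, (train_data, train_features))
for var, g_before, g_after in zip(model.trainable_variables, grads_before, grads_after):
    # How much one optimizer step changed the gradient of each variable
    print(var.name, tf.reduce_mean(tf.abs(g_before - g_after)).numpy())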
Let's assume a very simple model:
import tensorflow as tf

train_data = tf.random.normal((1000, 32))
train_features = tf.random.normal((1000, 1))  # targets shaped to match the model output

inputs = tf.keras.layers.Input(shape=(32,))
hidden_1 = tf.keras.layers.Dense(32)(inputs)
hidden_2 = tf.keras.layers.Dense(32)(hidden_1)
outputs = tf.keras.layers.Dense(1)(hidden_2)
model = tf.keras.Model(inputs, outputs)
Say you want to compute the gradients of all layer outputs with respect to the input.
You can use the following:
inputs = train_data
with tf.GradientTape(persistent=True) as tape:
    tape.watch(inputs)              # watch the actual input batch, not the symbolic Input tensor
    out_intermediate = []
    cargo = inputs
    for layer in model.layers[1:]:  # model.layers[0] is the InputLayer, so skip it
        cargo = layer(cargo)
        out_intermediate.append(cargo)

for x in out_intermediate:
    print(tape.gradient(x, inputs))
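An alternative sketch (my addition, not from the original answer) that avoids re-wiring the layers by hand: build an auxiliary functional model that exposes every intermediate layer output, then differentiate each output with respect to the input batch. The names aux_model and x_batch are illustrative:

# Auxiliary model returning every layer output (skipping the InputLayer)
aux_model = tf.keras.Model(inputs=model.input,
                           outputs=[layer.output for layer in model.layers[1:]])

x_batch = train_data
with tf.GradientTape(persistent=True) as tape:
    tape.watch(x_batch)
    intermediate_outputs = aux_model(x_batch, training=False)

for out in intermediate_outputs:
    # Gradient of each intermediate output w.r.t. the input batch
    print(tape.gradient(out, x_batch).shape)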
If you want to compute a custom loss, I recommend Customize what happens in Model.fit.
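For reference, the pattern that guide describes is subclassing tf.keras.Model and overriding train_step. A minimal sketch (illustrative MSE loss and hypothetical class name, not a definitive implementation) could look like this:

class CustomModel(tf.keras.Model):
    def train_step(self, data):
        x, y = data
        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)  # Forward pass
            # Illustrative loss; swap in whatever custom loss you need
            loss = tf.reduce_mean(tf.keras.losses.mse(y, y_pred))
        gradients = tape.gradient(loss, self.trainable_variables)
        self.optimizer.apply_gradients(zip(gradients, self.trainable_variables))
        return {"loss": loss}

# Hypothetical usage, reusing the functional graph defined above
custom_model = CustomModel(model.input, model.output)
custom_model.compile(optimizer="adam")
custom_model.fit(train_data, train_features, epochs=1, verbose=0)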