How to Check if Trainable Weights of Keras Model Change Before/After Custom Training Loop
I am trying to verify whether a custom training loop changes the weights of a Keras model. My current approach is to `deepcopy` the `model.trainable_weights` list before training and then compare it against `model.trainable_weights` after training. Is this a valid way to make the comparison? The results of my approach show that the weights do change (which is the expected outcome, since the loss decreases noticeably each epoch), but I just want to verify that what I am doing is valid. Below is slightly adapted code from the Keras custom training loop tutorial, plus the code I use to compare the weights before/after model training:
# Imports
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
from copy import deepcopy
# The model
inputs = keras.Input(shape=(784,), name="digits")
x1 = layers.Dense(64, activation="relu")(inputs)
x2 = layers.Dense(64, activation="relu")(x1)
outputs = layers.Dense(10, name="predictions")(x2)
model = keras.Model(inputs=inputs, outputs=outputs)
##########################
# WEIGHTS BEFORE TRAINING
##########################
# I use deepcopy here to avoid mutating the weights list during training
weights_before_training = deepcopy(model.trainable_weights)
##########################
# Keras Tutorial
##########################
# Load data
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = np.reshape(x_train, (-1, 784))
x_test = np.reshape(x_test, (-1, 784))
# Reduce the size of the data to speed up training
x_train = x_train[:128]
x_test = x_test[:128]
y_train = y_train[:128]
y_test = y_test[:128]
# Make tf dataset
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_dataset = train_dataset.shuffle(buffer_size=64).batch(16)
# The training loop
print('Begin Training')
optimizer = keras.optimizers.SGD(learning_rate=1e-3)
loss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)
epochs = 2
for epoch in range(epochs):
    # Logging start of epoch
    print("\nStart of epoch %d" % (epoch,))
    # Save loss values for logging
    loss_values = []
    # Iterate over the batches of the dataset.
    for step, (x_batch_train, y_batch_train) in enumerate(train_dataset):
        with tf.GradientTape() as tape:
            logits = model(x_batch_train, training=True)  # Logits for this minibatch
            loss_value = loss_fn(y_batch_train, logits)
        # Append to list for logging
        loss_values.append(loss_value)
        grads = tape.gradient(loss_value, model.trainable_weights)
        optimizer.apply_gradients(zip(grads, model.trainable_weights))
    print('Epoch Loss:', np.mean(loss_values))
print('End Training')
##########################
# WEIGHTS AFTER TRAINING
##########################
weights_after_training = model.trainable_weights
# Note: `trainable_weights` is a list of kernel and bias tensors.
print()
print('Begin Trainable Weights Comparison')
for i in range(len(weights_before_training)):
    print(f'Trainable Tensors for Element {i + 1} of List Are Equal:',
          tf.reduce_all(tf.equal(weights_before_training[i], weights_after_training[i])).numpy())
print('End Trainable Weights Comparison')
>>> Begin Training
>>> Start of epoch 0
>>> Epoch Loss: 44.66055
>>>
>>> Start of epoch 1
>>> Epoch Loss: 5.306543
>>> End Training
>>>
>>> Begin Trainable Weights Comparison
>>> Trainable Tensors for Element 1 of List Are Equal: False
>>> Trainable Tensors for Element 2 of List Are Equal: False
>>> Trainable Tensors for Element 3 of List Are Equal: False
>>> Trainable Tensors for Element 4 of List Are Equal: False
>>> Trainable Tensors for Element 5 of List Are Equal: False
>>> Trainable Tensors for Element 6 of List Are Equal: False
>>> End Trainable Weights Comparison
Summarizing from the comments and adding more information for the benefit of the community:
The approach followed in the code above, i.e., comparing `deepcopy(model.trainable_weights)` taken before training against `model.trainable_weights` after training the model with a custom training loop, is the correct approach.
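As a minimal sketch of the same idea (not part of the original post; the helper names `snapshot_weights` and `weights_changed` are illustrative, and the snippet assumes `np` and `model` from the code above), the snapshot can also be taken as plain NumPy copies and compared with `np.allclose`, which avoids keeping live `tf.Variable` references around:

def snapshot_weights(model):
    # Detached NumPy copies: later optimizer updates cannot change the snapshot.
    return [w.numpy().copy() for w in model.trainable_weights]

def weights_changed(before, model):
    # True if any trainable variable differs from its snapshot.
    after = [w.numpy() for w in model.trainable_weights]
    return any(not np.allclose(b, a) for b, a in zip(before, after))

# Hypothetical usage, wrapped around the training loop above:
# before = snapshot_weights(model)
# ... run the custom training loop ...
# print('Weights changed:', weights_changed(before, model))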
In addition, if we do not want to train the model at all, we can freeze all the layers of the model with `model.trainable = False`.
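For illustration only (a hedged sketch, assuming the three-Dense-layer `model` built above), freezing the model empties `trainable_weights`, so a custom training loop like the one above would have nothing to update:

# Sketch, not from the original answer; assumes `model` from the code above.
model.trainable = False  # freezes every layer in the model

print(len(model.trainable_weights))      # 0 -> the optimizer has nothing left to update
print(len(model.non_trainable_weights))  # 6 -> the kernels and biases are now non-trainable

# Layers can also be frozen individually if only part of the model should stay fixed, e.g.:
# model.get_layer("predictions").trainable = False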