Training a custom-made CNN model in the eager execution programming environment
I built a CNN model using the "Model Subclassing" API in Keras. Here is the class representing my model:
class ConvNet(tf.keras.Model):
    def __init__(self, data_format, classes):
        super(ConvNet, self).__init__()
        if data_format == "channels_first":
            axis = 1
        elif data_format == "channels_last":
            axis = -1
        self.conv_layer1 = tf.keras.layers.Conv2D(filters=32, kernel_size=3, strides=(1, 1),
                                                  padding="same", activation="relu")
        self.pool_layer1 = tf.keras.layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2))
        self.conv_layer2 = tf.keras.layers.Conv2D(filters=64, kernel_size=3, strides=(1, 1),
                                                  padding="same", activation="relu")
        self.pool_layer2 = tf.keras.layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2))
        self.conv_layer3 = tf.keras.layers.Conv2D(filters=128, kernel_size=5, strides=(1, 1),
                                                  padding="same", activation="relu")
        self.pool_layer3 = tf.keras.layers.MaxPooling2D(pool_size=(2, 2), strides=(1, 1),
                                                        padding="same")
        self.flatten = tf.keras.layers.Flatten()
        self.dense_layer1 = tf.keras.layers.Dense(units=512, activation="relu")
        self.dense_layer2 = tf.keras.layers.Dense(units=classes, activation="softmax")

    def call(self, inputs, training=True):
        output_tensor = self.conv_layer1(inputs)
        output_tensor = self.pool_layer1(output_tensor)
        output_tensor = self.conv_layer2(output_tensor)
        output_tensor = self.pool_layer2(output_tensor)
        output_tensor = self.conv_layer3(output_tensor)
        output_tensor = self.pool_layer3(output_tensor)
        output_tensor = self.flatten(output_tensor)
        output_tensor = self.dense_layer1(output_tensor)
        return self.dense_layer2(output_tensor)
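For reference, the same "Model Subclassing" pattern can be shown with a minimal, self-contained sketch; the layer sizes and the 28x28x1 input shape below are assumptions chosen purely for illustration:

```python
import tensorflow as tf

class TinyNet(tf.keras.Model):
    """Minimal Model-subclassing example mirroring the structure above."""
    def __init__(self, classes):
        super().__init__()
        self.conv = tf.keras.layers.Conv2D(8, 3, padding="same", activation="relu")
        self.pool = tf.keras.layers.MaxPooling2D(2)
        self.flatten = tf.keras.layers.Flatten()
        self.out = tf.keras.layers.Dense(classes, activation="softmax")

    def call(self, inputs, training=False):
        x = self.conv(inputs)
        x = self.pool(x)
        x = self.flatten(x)
        return self.out(x)

model = TinyNet(classes=10)
batch = tf.random.normal([4, 28, 28, 1])  # 4 dummy grayscale images
probs = model(batch, training=False)      # forward pass -> (4, 10) probabilities
```

Calling the model instance directly on a batch of tensors runs `call` eagerly and returns the predictions.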
I would like to know how to train it "eagerly", by which I mean avoiding the use of the compile and fit methods.
I am not sure how exactly to build the training loop. I understand that I must use tf.GradientTape.gradient() to compute the gradients and then use optimizers.apply_gradients() to update my model's parameters.
What I don't understand is how to make predictions with my model to obtain the logits and then use them to compute the loss. I would appreciate it if someone could help me understand how to build the training loop.
Eager execution is an imperative programming mode that lets developers follow Python's natural control flow. Essentially, you don't need to first create placeholders and a computational graph and then execute them in a TensorFlow session. You can use automatic differentiation to compute the gradients in your training loop:
for i in range(iterations):
    with tf.GradientTape() as tape:
        logits = model(batch_examples, training=True)
        loss = tf.losses.sparse_softmax_cross_entropy(batch_labels, logits)
    grads = tape.gradient(loss, model.trainable_variables)
    opt.apply_gradients(zip(grads, model.trainable_variables))
This assumes that model is a subclass of Keras's Model class. I hope this solves your problem! You should also check out the TensorFlow Guide on eager execution.
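Putting it together, here is a minimal end-to-end sketch using the current tf.keras APIs (the model, synthetic data, and hyperparameters are assumptions for illustration). Note that the final Dense layer emits raw logits with no softmax, so the loss can apply softmax internally, which is more numerically stable than computing the loss on probabilities:

```python
import tensorflow as tf

class SmallNet(tf.keras.Model):
    """Tiny subclassed model whose final layer outputs raw logits."""
    def __init__(self, classes):
        super().__init__()
        self.flatten = tf.keras.layers.Flatten()
        self.hidden = tf.keras.layers.Dense(32, activation="relu")
        self.logits = tf.keras.layers.Dense(classes)  # no activation: raw logits

    def call(self, inputs, training=False):
        return self.logits(self.hidden(self.flatten(inputs)))

model = SmallNet(classes=10)
opt = tf.keras.optimizers.Adam(1e-3)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Synthetic data, purely for illustration.
x = tf.random.normal([64, 8, 8, 1])
y = tf.random.uniform([64], maxval=10, dtype=tf.int32)

for step in range(5):
    with tf.GradientTape() as tape:
        logits = model(x, training=True)  # forward pass -> logits
        loss = loss_fn(y, logits)         # scalar loss on the batch
    grads = tape.gradient(loss, model.trainable_variables)
    # zip() pairs each gradient with its corresponding variable.
    opt.apply_gradients(zip(grads, model.trainable_variables))
```

The same loop applies unchanged to the ConvNet above; only the model construction differs.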