How to take trainable parameters into a loss function in tensorflow.keras
I am trying to implement a loss function whose computation requires variables from a convolutional layer. The official documentation gives one way to involve variables in a loss function:
If this is not the case for your loss (if, for example, your loss references a Variable of one of the model's layers), you can wrap your loss in a zero-argument lambda. These losses are not tracked as part of the model's topology since they can't be serialized.
inputs = tf.keras.Input(shape=(10,))
d = tf.keras.layers.Dense(10)
x = d(inputs)
outputs = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)
# Weight regularization. Keep a reference to the layer `d` itself:
# `x` is a tensor and has no `kernel` attribute, so the lambda must
# close over the layer to read its kernel at each evaluation.
model.add_loss(lambda: tf.reduce_mean(d.kernel))
However, this only adds a simple regularizer to the model. Is there a way to implement more complex regularizers that involve computations between variables in different layers? And what happens if a trainable variable is added to the regularizer as well?
You can add arbitrarily complex loss functions using the add_loss API. Here is an example that adds a loss which uses the weights of two different layers.
import tensorflow as tf
print('TensorFlow:', tf.__version__)
inp = tf.keras.Input(shape=[10])
x = tf.keras.layers.Dense(16)(inp)
x = tf.keras.layers.Dense(32)(x)
x = tf.keras.layers.Dense(4)(x)
out = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inputs=[inp], outputs=[out])
model.summary()
def custom_loss(weight_a, weight_b):
    def _custom_loss():
        # This can include any arbitrary logic
        loss = tf.norm(weight_a) + tf.norm(weight_b)
        return loss
    return _custom_loss
weight_a = model.layers[2].kernel
weight_b = model.layers[3].kernel
model.add_loss(custom_loss(weight_a, weight_b))
print('\nlosses:', model.losses)
Output:
TensorFlow: 2.3.0-dev20200611
Model: "functional_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, 10)]              0
_________________________________________________________________
dense (Dense)                (None, 16)                176
_________________________________________________________________
dense_1 (Dense)              (None, 32)                544
_________________________________________________________________
dense_2 (Dense)              (None, 4)                 132
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 5
=================================================================
Total params: 857
Trainable params: 857
Non-trainable params: 0
_________________________________________________________________
losses: [<tf.Tensor: shape=(), dtype=float32, numpy=7.3701963>]
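For completeness, a quick sketch (my own addition, using random NumPy data) showing that during training Keras sums everything in model.losses with the compiled objective:

import numpy as np

# Hypothetical random data, only to demonstrate that the add_loss term
# is combined with the compiled 'mse' loss during fit().
x_train = np.random.rand(64, 10).astype('float32')
y_train = np.random.rand(64, 1).astype('float32')

model.compile(optimizer='sgd', loss='mse')
model.fit(x_train, y_train, epochs=1, verbose=2)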
Inspired by @Srihari Humbarwadi's answer, I found a way to implement complex regularization that involves:
- adding a trainable parameter to the regularization loss
- custom computations between the weights of different layers
The idea is to build a subclassed model:
import numpy as np
import tensorflow as tf
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Dense


class Pseudo_Model(Model):
    def __init__(self, **kwargs):
        super(Pseudo_Model, self).__init__(**kwargs)
        self.dense1 = Dense(16)
        self.dense2 = Dense(4)
        self.dense3 = Dense(2)
        # Extra trainable scalar used in the regularization loss
        # (tf.Variable is trainable by default).
        self.a = tf.Variable(shape=(1,), initial_value=tf.ones(shape=(1,)))

    def call(self, inputs, training=True, mask=None):
        x = self.dense1(inputs)
        x = self.dense2(x)
        x = self.dense3(x)
        return x
The model is built with:
sub_model = Pseudo_Model(name='sub_model')
inputs = Input(shape=(32,))
outputs = sub_model(inputs)
model = Model(inputs, outputs)
model.summary()
model.get_layer('sub_model').summary()
The model structure (note that the 607 total parameters are the 606 dense-layer weights plus the trainable scalar a):
Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, 32)]              0
_________________________________________________________________
sub_model (Pseudo_Model)     (None, 2)                 607
=================================================================
Total params: 607
Trainable params: 607
Non-trainable params: 0
_________________________________________________________________
Model: "sub_model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                (None, 16)                528
_________________________________________________________________
dense_1 (Dense)              (None, 4)                 68
_________________________________________________________________
dense_2 (Dense)              (None, 2)                 10
=================================================================
Total params: 607
Trainable params: 607
Non-trainable params: 0
_________________________________________________________________
Then define the loss function as @Srihari Humbarwadi suggested, just with the new trainable parameter a added:
def custom_loss(weight_a, weight_b, a):
    def _custom_loss():
        # This can include any arbitrary logic
        loss = a * tf.norm(weight_a) + tf.norm(weight_b)
        return loss
    return _custom_loss
Then add the loss to the model via the add_loss() API:
a_ = model.get_layer('sub_model').a
weighta = model.get_layer('sub_model').layers[0].kernel
weightb = model.get_layer('sub_model').layers[1].kernel
model.get_layer('sub_model').add_loss(custom_loss(weighta, weightb, a_))
print(model.losses)
#[<tf.Tensor: id=116, shape=(1,), dtype=float32, numpy=array([7.2659254], dtype=float32)>]
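To double-check that a will actually receive gradient updates, one can also verify (my own addition) that it is tracked among the model's trainable variables:

# The subclassed sub_model tracks `a`, so it shows up in the outer
# model's trainable variables and is updated by the optimizer.
print([v.name for v in model.trainable_variables])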
Then I create a fake dataset to test it:
fake_data = np.random.rand(1000, 32)
fake_labels = np.random.rand(1000, 2)
model.compile(optimizer=tf.keras.optimizers.SGD(), loss='mse')
model.fit(x=fake_data, y=fake_labels, epochs=5)
print(model.get_layer(name='sub_model').a)
As you can see, both the variable and the loss are being updated:
Train on 1000 samples
Epoch 1/5
2020-06-19 19:21:02.475464: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
1000/1000 - 1s - loss: 3.9039
Epoch 2/5
1000/1000 - 0s - loss: -3.0905e+00
Epoch 3/5
1000/1000 - 0s - loss: -1.2103e+01
Epoch 4/5
1000/1000 - 0s - loss: -2.6855e+01
Epoch 5/5
1000/1000 - 0s - loss: -5.3408e+01
<tf.Variable 'Variable:0' shape=(1,) dtype=float32, numpy=array([-8.13609], dtype=float32)>
Process finished with exit code 0
However, this is still a very hacky approach. Since a is unconstrained and multiplies a norm, the optimizer simply pushes a negative to lower the loss, which is why the loss above diverges. I don't know whether there is a more elegant and stable way to achieve the same thing.
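One way this could be stabilized (a sketch of my own, not from the original answers; custom_loss_stable is a hypothetical name): squash a through tf.nn.softplus inside the loss, so the learned coefficient stays positive and the regularization term cannot be driven to negative infinity:

def custom_loss_stable(weight_a, weight_b, a):
    def _custom_loss():
        # softplus(a) > 0, so the optimizer can rescale the penalty but
        # cannot make the regularization term arbitrarily negative.
        return tf.nn.softplus(a) * tf.norm(weight_a) + tf.norm(weight_b)
    return _custom_loss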