使用多层权重的正则化函数?
Regularization function using weights from multiple layers?
我不知道这是否可行,但我问一下以防万一。这是我模型的(简化)架构。
Layer (type) Output Shape Param #Connected to
==========================================
input_1 (InputLayer) [(None, 7, 7, 1024) 0
conv (Conv2D) (None, 7, 7, 10) 10240 input_1[0][0]
其中“conv”中的 10 个过滤器中的每一个都是 1x1x1024 卷积过滤器(没有偏差,但与此特定问题无关)。
我目前在“conv”上使用自定义正则化函数来确保滤波器权重的 (1x1)x1024x10 矩阵具有良好的 属性(基本上所有向量都是成对正交的),到目前为止,一切正常预期的。
现在,我还希望能够 禁用对这 10 个过滤器中的某些过滤器的训练 。我知道如何做到这一点的唯一方法是独立实现 10 个过滤器,如下所示
Layer (type) Output Shape Param # Connected to
=========================================================
input_1 (InputLayer) [(None, 7, 7, 1024) 0
conv_1 (Conv2D) (None, 7, 7, 1) 1024 input_1[0][0]
conv_2 (Conv2D) (None, 7, 7, 1) 1024 input_1[0][0]
conv_3 (Conv2D) (None, 7, 7, 1) 1024 input_1[0][0]
...
conv_10 (Conv2D) (None, 7, 7, 1) 1024 input_1[0][0]
之后是连接层,然后在我认为合适的每个 conv_i 层上将“可训练”参数设置为 True/False。但是,现在我不知道如何实现我的正则化函数,它必须同时计算所有层 conv_i 的权重而不是独立计算。
有什么技巧可以用来实现这样的功能吗?或者反过来,有没有办法只冻结卷积层的部分权重?
谢谢!
解决方案
对于那些感兴趣的人,这里是按照@LaplaceRicky 提供的建议解决我的问题的工作代码。
class SpecialRegularization(tf.keras.Model):
""" In order to avoid a warning message when saving the model,
I use the solution indicated here
https://github.com/tensorflow/tensorflow/issues/44541
and now inherit from tf.keras.Model instead of Layer
"""
def __init__(self,nfilters,**kwargs):
super().__init__(**kwargs)
self.inner_layers=[Conv2D(1,(1,1)) for _ in range(nfilters)]
def call(self, inputs):
outputs=[l(inputs) for l in self.inner_layers]
self.add_loss(self.define_your_regularization_here())
return tf.concat(outputs,-1)
def set_trainable_parts(self, trainables):
""" Set the trainable attribute independently on each filter """
for l,t in zip(self.inner_layers,trainables):
l.trainable = t
def define_your_regularization_here(self):
#reconstruct the original kernel
large_kernel=tf.concat([l.kernel for l in self.inner_layers],-1)
return tf.reduce_sum(large_kernel*large_kernel[:,:,:,::-1])
实现这一点的一种方法是拥有一个自定义的 keras 层,它包装所有小的 conv 层并负责计算正则化损失。
示例代码:
import tensorflow as tf
def _get_losses(model,x):
model(x)
return model.losses
def _get_grads(model,x):
with tf.GradientTape() as t:
model(x)
reg_loss=tf.math.add_n(model.losses)
return t.gradient(reg_loss,model.trainable_weights)
class SpecialRegularization(tf.keras.layers.Layer):
def __init__(self, **kwargs):
self.inner_layers=[tf.keras.layers.Conv2D(1,(1,1)) for i in range(10)]
super().__init__(**kwargs)
def call(self, inputs,training=None):
outputs=[l(inputs,training=training) for l in self.inner_layers]
self.add_loss(self.define_your_regularization_here())
return tf.concat(outputs,-1)
def define_your_regularization_here(self):
#reconstruct the original kernel
large_kernel=tf.concat([l.kernel for l in self.inner_layers],-1)
#just giving an example here
#you should define your own regularization using the entire kernel
return tf.reduce_sum(large_kernel*large_kernel[:,:,:,::-1])
tf.random.set_seed(123)
inputs = tf.keras.Input(shape=(7,7,1024))
outputs = SpecialRegularization()(inputs)
model = tf.keras.Model(inputs=inputs, outputs=outputs)
#get_losses, get_grads are for demonstration purpose
get_losses=tf.function(_get_losses)
get_grads=tf.function(_get_grads)
data=tf.random.normal((64,7,7,1024))
print(get_losses(model,data))
print(get_grads(model,data)[0])
print(model.layers[1].inner_layers[-1].kernel*2)
model.summary()
'''
[<tf.Tensor: shape=(), dtype=float32, numpy=-0.20446025>]
tf.Tensor(
[[[[ 0.02072023]
[ 0.12973154]
[ 0.11631528]
...
[ 0.00804012]
[-0.07299817]
[ 0.06031524]]]], shape=(1, 1, 1024, 1), dtype=float32)
tf.Tensor(
[[[[ 0.02072023]
[ 0.12973154]
[ 0.11631528]
...
[ 0.00804012]
[-0.07299817]
[ 0.06031524]]]], shape=(1, 1, 1024, 1), dtype=float32)
Model: "model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 7, 7, 1024)] 0
_________________________________________________________________
special_regularization (Spec (None, 7, 7, 10) 10250
=================================================================
Total params: 10,250
Trainable params: 10,250
Non-trainable params: 0
_________________________________________________________________
'''
我不知道这是否可行,但我问一下以防万一。这是我模型的(简化)架构。
Layer (type) Output Shape Param #Connected to
==========================================
input_1 (InputLayer) [(None, 7, 7, 1024) 0
conv (Conv2D) (None, 7, 7, 10) 10240 input_1[0][0]
其中“conv”中的 10 个过滤器中的每一个都是 1x1x1024 卷积过滤器(没有偏差,但与此特定问题无关)。 我目前在“conv”上使用自定义正则化函数来确保滤波器权重的 (1x1)x1024x10 矩阵具有良好的 属性(基本上所有向量都是成对正交的),到目前为止,一切正常预期的。 现在,我还希望能够 禁用对这 10 个过滤器中的某些过滤器的训练 。我知道如何做到这一点的唯一方法是独立实现 10 个过滤器,如下所示
Layer (type) Output Shape Param # Connected to
=========================================================
input_1 (InputLayer) [(None, 7, 7, 1024) 0
conv_1 (Conv2D) (None, 7, 7, 1) 1024 input_1[0][0]
conv_2 (Conv2D) (None, 7, 7, 1) 1024 input_1[0][0]
conv_3 (Conv2D) (None, 7, 7, 1) 1024 input_1[0][0]
...
conv_10 (Conv2D) (None, 7, 7, 1) 1024 input_1[0][0]
之后是连接层,然后在我认为合适的每个 conv_i 层上将“可训练”参数设置为 True/False。但是,现在我不知道如何实现我的正则化函数,它必须同时计算所有层 conv_i 的权重而不是独立计算。 有什么技巧可以用来实现这样的功能吗?或者反过来,有没有办法只冻结卷积层的部分权重? 谢谢!
解决方案
对于那些感兴趣的人,这里是按照@LaplaceRicky 提供的建议解决我的问题的工作代码。
class SpecialRegularization(tf.keras.Model):
""" In order to avoid a warning message when saving the model,
I use the solution indicated here
https://github.com/tensorflow/tensorflow/issues/44541
and now inherit from tf.keras.Model instead of Layer
"""
def __init__(self,nfilters,**kwargs):
super().__init__(**kwargs)
self.inner_layers=[Conv2D(1,(1,1)) for _ in range(nfilters)]
def call(self, inputs):
outputs=[l(inputs) for l in self.inner_layers]
self.add_loss(self.define_your_regularization_here())
return tf.concat(outputs,-1)
def set_trainable_parts(self, trainables):
""" Set the trainable attribute independently on each filter """
for l,t in zip(self.inner_layers,trainables):
l.trainable = t
def define_your_regularization_here(self):
#reconstruct the original kernel
large_kernel=tf.concat([l.kernel for l in self.inner_layers],-1)
return tf.reduce_sum(large_kernel*large_kernel[:,:,:,::-1])
实现这一点的一种方法是拥有一个自定义的 keras 层,它包装所有小的 conv 层并负责计算正则化损失。
示例代码:
import tensorflow as tf
def _get_losses(model,x):
model(x)
return model.losses
def _get_grads(model,x):
with tf.GradientTape() as t:
model(x)
reg_loss=tf.math.add_n(model.losses)
return t.gradient(reg_loss,model.trainable_weights)
class SpecialRegularization(tf.keras.layers.Layer):
def __init__(self, **kwargs):
self.inner_layers=[tf.keras.layers.Conv2D(1,(1,1)) for i in range(10)]
super().__init__(**kwargs)
def call(self, inputs,training=None):
outputs=[l(inputs,training=training) for l in self.inner_layers]
self.add_loss(self.define_your_regularization_here())
return tf.concat(outputs,-1)
def define_your_regularization_here(self):
#reconstruct the original kernel
large_kernel=tf.concat([l.kernel for l in self.inner_layers],-1)
#just giving an example here
#you should define your own regularization using the entire kernel
return tf.reduce_sum(large_kernel*large_kernel[:,:,:,::-1])
tf.random.set_seed(123)
inputs = tf.keras.Input(shape=(7,7,1024))
outputs = SpecialRegularization()(inputs)
model = tf.keras.Model(inputs=inputs, outputs=outputs)
#get_losses, get_grads are for demonstration purpose
get_losses=tf.function(_get_losses)
get_grads=tf.function(_get_grads)
data=tf.random.normal((64,7,7,1024))
print(get_losses(model,data))
print(get_grads(model,data)[0])
print(model.layers[1].inner_layers[-1].kernel*2)
model.summary()
'''
[<tf.Tensor: shape=(), dtype=float32, numpy=-0.20446025>]
tf.Tensor(
[[[[ 0.02072023]
[ 0.12973154]
[ 0.11631528]
...
[ 0.00804012]
[-0.07299817]
[ 0.06031524]]]], shape=(1, 1, 1024, 1), dtype=float32)
tf.Tensor(
[[[[ 0.02072023]
[ 0.12973154]
[ 0.11631528]
...
[ 0.00804012]
[-0.07299817]
[ 0.06031524]]]], shape=(1, 1, 1024, 1), dtype=float32)
Model: "model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 7, 7, 1024)] 0
_________________________________________________________________
special_regularization (Spec (None, 7, 7, 10) 10250
=================================================================
Total params: 10,250
Trainable params: 10,250
Non-trainable params: 0
_________________________________________________________________
'''