Computing directional gradient w.r.t. the weights in TensorFlow
I want to compute the gradient of the loss w.r.t. the weights of a tf model, but only along one direction:
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam', loss=tf.keras.losses.BinaryCrossentropy(from_logits=False))

features = tf.random.normal((1000, 10))
labels = tf.random.normal((1000,))
model.fit(features, labels, batch_size=32, epochs=1)

x_star = model.layers[0].weights  # the layer has kernel and bias
v = tf.random.normal((10, 1))  # direction of the gradient

def directional_loss(model, x, y, t):
    # move the weights to x_star + t*v, then evaluate the loss there
    model.layers[0].set_weights([x_star[0] + t*v, x_star[1]])
    y_ = model(x)
    return model.loss(y_true=y, y_pred=y_)

def directional_grad(model, inputs, targets, t):
    with tf.GradientTape() as tape:
        loss_value = directional_loss(model, inputs, targets, t)
    return loss_value, tape.gradient(loss_value, t)

t = 0.
loss_value, grads = directional_grad(model, features, labels, t)
but it returns the following error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in directional_grad
  File "C:\Users\pierr\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\eager\backprop.py", line 1070, in gradient
    if not backprop_util.IsTrainable(t):
  File "C:\Users\pierr\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\eager\backprop_util.py", line 58, in IsTrainable
    dtype = dtypes.as_dtype(dtype)
  File "C:\Users\pierr\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\framework\dtypes.py", line 725, in as_dtype
    raise TypeError(f"Cannot convert value {type_value!r} to a TensorFlow DType.")
TypeError: Cannot convert value 0.0 to a TensorFlow DType.
I think this is because the operation model.layers[0].set_weights is not differentiable. How can I fix it?
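For reference, here is a sketch of what I would expect a differentiable version to look like, if I write the dense layer's forward pass out by hand instead of calling set_weights (tf.sigmoid and the variable t are my own choices here):

kernel, bias = x_star  # trained weights from above

def directional_grad_sketch(inputs, targets):
    t = tf.Variable(0.0)  # t has to be a tf.Variable watched by the tape, not a Python float
    with tf.GradientTape() as tape:
        # the perturbation along v is recorded on the tape, so d(loss)/dt exists
        y_ = tf.sigmoid(tf.matmul(inputs, kernel + t * v) + bias)
        loss_value = model.loss(y_true=targets, y_pred=y_)
    return loss_value, tape.gradient(loss_value, t)

loss_value, grad_t = directional_grad_sketch(features, labels)

But writing every layer's forward pass out by hand does not scale, hence my questions below.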
Alternatively, in TensorFlow, can I compute the output of a layer while specifying the weights directly, something like y = layer(x, weights=w)?
Finally, is there another way than re-creating the layer from the class tf.keras.layers.Layer and re-defining its build and call methods, for example for the dense layer:
class CustomLayer(tf.keras.layers.Layer):
    def __init__(self, x_star, direction_vectors, activation=None):
        super(CustomLayer, self).__init__()
        self.x_star = x_star  # x_star[0] is the kernel matrix and x_star[1] is the bias
        # one direction vector per scalar coordinate, reshaped to the kernel's shape
        self.direction_vectors = tf.reshape(direction_vectors, [direction_vectors.shape[0], x_star[0].shape[0], x_star[0].shape[1]])
        self.activation = activation

    def build(self, input_shape):
        # the only trainable weights are the coordinates t along the directions
        self.kernel = self.add_weight("kernel", shape=[self.direction_vectors.shape[0],])

    def call(self, inputs):
        # dense layer evaluated at x_star + sum_i t_i * v_i
        outputs = tf.matmul(inputs, self.x_star[0] + tf.tensordot(self.kernel, self.direction_vectors, axes=[[0], [0]])) + self.x_star[1]
        if self.activation is not None:
            outputs = self.activation(outputs)
        return outputs
as done in https://github.com/Bras-P/gibbs-measures-with-singular-hessian/blob/main/T4-expansion.ipynb ?
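For completeness, this is roughly how I would use this workaround (my own sketch; I pass the weights as plain tensors so that Keras does not track them as trainable, and I reset t to 0 so the derivative is taken at x_star):

x_star_const = [tf.convert_to_tensor(w) for w in x_star]  # plain tensors, not tracked as layer weights
custom_model = tf.keras.Sequential([
    CustomLayer(x_star_const, tf.reshape(v, (1, 10, 1)), activation=tf.sigmoid)
])
_ = custom_model(features)                  # build the layer, creating the scalar coordinate t
custom_model.layers[0].kernel.assign([0.])  # evaluate the derivative at t = 0, i.e. at x_star

with tf.GradientTape() as tape:
    loss_value = model.loss(y_true=labels, y_pred=custom_model(features))
grads = tape.gradient(loss_value, custom_model.layers[0].kernel)

The gradient w.r.t. the scalar kernel t at t = 0 should then be exactly the directional derivative of the loss at x_star along v, if I am not mistaken.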