Calculating Hessian with tensorflow gradient tape
Thanks in advance for any attention to this question.
I want to compute the Hessian matrix of a tensorflow.keras.Model.
For higher-order derivatives, I tried nesting GradientTape:
# Example graph and input
xs = tf.constant(tf.random.normal([100, 24]))
ex_model = Sequential()
ex_model.add(Input(shape=(24,)))
ex_model.add(Dense(10))
ex_model.add(Dense(1))

with tf.GradientTape(persistent=True) as tape:
    tape.watch(xs)
    ys = ex_model(xs)
    g = tape.gradient(ys, xs)
    h = tape.jacobian(g, xs)
print(g.shape)
print(h.shape)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-20-dbf443f1ddab> in <module>
5 h = tape.jacobian(g, xs)
6 print(g.shape)
----> 7 print(h.shape)
AttributeError: 'NoneType' object has no attribute 'shape'
And here is another trial...
with tf.GradientTape() as tape1:
    with tf.GradientTape() as tape2:
        tape2.watch(xs)
        ys = ex_model(xs)
    g = tape2.gradient(ys, xs)
h = tape1.jacobian(g, xs)

print(g.shape)
print(h.shape)
(100, 24)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-17-c5bbb17404bc> in <module>
7
8 print(g.shape)
----> 9 print(h.shape)
AttributeError: 'NoneType' object has no attribute 'shape'
Why can't I compute the gradient of g wrt xs?
You have already computed the second-order gradient of ys wrt xs, and it is zero: ex_model is linear in xs (Dense layers with no activation), so g does not depend on xs at all. A gradient taken wrt something the result does not depend on should be zero, and TensorFlow represents it as None — that is why tape1.jacobian(g, xs) returns None.
When the second-order gradient is not wrt a constant:
import tensorflow as tf

x = tf.Variable(1.0)
w = tf.constant(3.0)

with tf.GradientTape() as t2:
    with tf.GradientTape() as t1:
        y = w * x**3
    dy_dx = t1.gradient(y, x)
d2y_dx2 = t2.gradient(dy_dx, x)

print('dy_dx:', dy_dx)      # 3 * 3 * x**2 => 9.0
print('d2y_dx2:', d2y_dx2)  # 9 * 2 * x => 18.0
Output:
dy_dx: tf.Tensor(9.0, shape=(), dtype=float32)
d2y_dx2: tf.Tensor(18.0, shape=(), dtype=float32)
When the second-order gradient is wrt a constant:
import tensorflow as tf

x = tf.Variable(1.0)
w = tf.constant(3.0)

with tf.GradientTape() as t2:
    with tf.GradientTape() as t1:
        y = w * x
    dy_dx = t1.gradient(y, x)
d2y_dx2 = t2.gradient(dy_dx, x)

print('dy_dx:', dy_dx)
print('d2y_dx2:', d2y_dx2)
Output:
dy_dx: tf.Tensor(3.0, shape=(), dtype=float32)
d2y_dx2: None
However, you can still compute a second-order gradient wrt xs that involves the layers' parameters, e.g. Input gradient regularization.
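As a sketch of how the asker's original goal can be reached (this model is hypothetical, not from the question): giving one Dense layer a nonlinear activation such as tanh makes ys nonlinear in xs, so the second derivative is no longer identically zero and the nested tapes return a real Hessian. batch_jacobian keeps only the per-sample Hessian blocks instead of the full (100, 24, 100, 24) Jacobian.

```python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense

xs = tf.random.normal([100, 24])
# tanh makes the model nonlinear in xs, so the Hessian is nonzero.
model = Sequential([
    Input(shape=(24,)),
    Dense(10, activation='tanh'),
    Dense(1),
])

with tf.GradientTape() as outer:
    outer.watch(xs)
    with tf.GradientTape() as inner:
        inner.watch(xs)
        ys = model(xs)
    # First-order gradient, computed inside the outer tape so it is recorded.
    g = inner.gradient(ys, xs)           # shape (100, 24)
# Per-sample Hessians: one (24, 24) block per input row.
h = outer.batch_jacobian(g, xs)          # shape (100, 24, 24)

print(g.shape, h.shape)
```

Note that g must be computed inside the outer tape's context, otherwise the outer tape never records the operations of the first gradient and batch_jacobian would again return None.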