GradientTape returning None when run in a loop

The following gradient descent fails because the gradients returned by tape.gradient() are None the second time the loop runs.

import tensorflow as tf

w = tf.Variable(tf.random.normal((3, 2)), name='w')
b = tf.Variable(tf.zeros(2, dtype=tf.float32), name='b')
x = tf.constant([[1., 2., 3.]])


for i in range(10):
  print("iter {}".format(i))
  with tf.GradientTape() as tape:
    #forward prop
    y = x @ w + b  
    loss = tf.reduce_mean(y**2)
    print("loss is \n{}".format(loss))
    print("output- y is \n{}".format(y))
    #vars getting dropped after couple of iterations
    print(tape.watched_variables()) 
  
  #get the gradients to minimize the loss
  dl_dw, dl_db = tape.gradient(loss,[w,b]) 

  #descend the gradients
  w = w.assign_sub(0.001*dl_dw)
  b = b.assign_sub(0.001*dl_db)

Output:

iter 0
loss is 
23.328645706176758
output- y is 
[[ 6.8125362  -0.49663293]]
(<tf.Variable 'w:0' shape=(3, 2) dtype=float32, numpy=
array([[-1.3461215 ,  0.43708783],
       [ 1.5931423 ,  0.31951016],
       [ 1.6574576 , -0.52424705]], dtype=float32)>, <tf.Variable 'b:0' shape=(2,) dtype=float32, numpy=array([0., 0.], dtype=float32)>)
iter 1
loss is 
22.634033203125
output- y is 
[[ 6.7103477  -0.48918355]]
()

TypeError                                 Traceback (most recent call last)
c:\projects\pyspace\mltest\test.ipynb Cell 7' in <cell line: 1>()
     11 dl_dw, dl_db = tape.gradient(loss,[w,b]) 
     13 #descend the gradients
---> 14 w = w.assign_sub(0.001*dl_dw)
     15 b = b.assign_sub(0.001*dl_db)

TypeError: unsupported operand type(s) for *: 'float' and 'NoneType'

I went through the documentation that explains the possible reasons for gradients becoming None, but none of them helped.

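This is because assign_sub returns a Tensor. In the line w = w.assign_sub(0.001*dl_dw) you are therefore overwriting w with a tensor holding the new value. In the next iteration, w is no longer a Variable and is not tracked by the gradient tape by default, which is why tape.watched_variables() prints an empty tuple and the gradients come back as None. (Tensors also do not have an assign_sub method, so that call would crash as well.)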

Instead, simply write w.assign_sub(0.001*dl_dw), and do the same for b. The assign functions work in place, so no reassignment is needed.
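
For reference, here is a minimal sketch of the loop with that change applied, assuming TensorFlow 2.x in eager mode. The setup is the same as in the question; the losses list is not part of the original code and is only added here so the progress can be checked afterwards.

```python
import tensorflow as tf

w = tf.Variable(tf.random.normal((3, 2)), name='w')
b = tf.Variable(tf.zeros(2, dtype=tf.float32), name='b')
x = tf.constant([[1., 2., 3.]])

losses = []  # illustrative only: records the loss of each iteration

for i in range(10):
    with tf.GradientTape() as tape:
        # forward pass; w and b are trainable Variables, so the tape watches them
        y = x @ w + b
        loss = tf.reduce_mean(y ** 2)

    # gradients of the loss with respect to both variables
    dl_dw, dl_db = tape.gradient(loss, [w, b])

    # update in place: the names w and b are NOT rebound,
    # so they still refer to the original Variables in the next iteration
    w.assign_sub(0.001 * dl_dw)
    b.assign_sub(0.001 * dl_db)

    losses.append(float(loss))
    print("iter {} loss {}".format(i, losses[-1]))
```

With this version w and b stay the same tf.Variable objects throughout, so the tape keeps watching them and the loss should shrink a little on every iteration.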