
gradient calculation for bias term using GradientTape()

I want to compute gradient tensors separately with respect to the weight variable and the bias term. The gradient for the weight variable is computed correctly, but the gradient for the bias is not. Please tell me what the problem is, or correct my code.

import numpy as np
import tensorflow as tf

X = tf.constant([[1.0, 0.1, -1.0], [2.0, 0.2, -2.0], [3.0, 0.3, -3.0],
                 [4.0, 0.4, -4.0], [5.0, 0.5, -5.0]])
b1 = tf.Variable(-0.5)
Bb = tf.constant([[1.0], [1.0], [1.0], [1.0], [1.0]])
Bb = b1 * Bb

Y0 = tf.constant([ [-10.0], [-5.0], [0.0], [5.0], [10.0] ])

W = tf.Variable([ [1.0], [1.0], [1.0] ])

with tf.GradientTape() as tape: 
    Y = tf.matmul(X, W) + Bb
    print("Y : ", Y.numpy())

    loss_val = tf.reduce_sum(tf.square(Y - Y0))  
    print("loss : ", loss_val.numpy())

gw = tape.gradient(loss_val, W)   # gradient calculation works well 
gb = tape.gradient(loss_val, b1)  # does NOT work

print("gradient W : ", gw.numpy())
print("gradient b : ", gb.numpy())

Two things. First, if you look at the documentation here -

https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/GradientTape#args

you will see that you can only call gradient once, unless persistent=True.

Second, you compute Bb = b1 * Bb outside the tape's context manager, so this operation is not recorded.

import numpy as np
import tensorflow as tf

X = tf.constant([[1.0, 0.1, -1.0], [2.0, 0.2, -2.0], [3.0, 0.3, -3.0],
                 [4.0, 0.4, -4.0], [5.0, 0.5, -5.0]])
b1 = tf.Variable(-0.5)
Bb = tf.constant([[1.0], [1.0], [1.0], [1.0], [1.0]])


Y0 = tf.constant([ [-10.0], [-5.0], [0.0], [5.0], [10.0] ])

W = tf.Variable([ [1.0], [1.0], [1.0] ])

with tf.GradientTape(persistent=True) as tape: 
    Bb = b1 * Bb  # now inside the tape, so the op is recorded
    Y = tf.matmul(X, W) + Bb
    print("Y : ", Y.numpy())

    loss_val = tf.reduce_sum(tf.square(Y - Y0))  
    print("loss : ", loss_val.numpy())

gw = tape.gradient(loss_val, W)   # gradient w.r.t. the weights
gb = tape.gradient(loss_val, b1)  # now works: b1 is used inside the tape

print("gradient W : ", gw.numpy())
print("gradient b : ", gb.numpy())
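As an aside, persistent=True is only needed because gradient is called twice. If you pass both sources in a single call, a non-persistent tape suffices. A minimal sketch of that variant (same data as above; the ones vector replaces the Bb constant):

```python
import tensorflow as tf

X = tf.constant([[1.0, 0.1, -1.0], [2.0, 0.2, -2.0], [3.0, 0.3, -3.0],
                 [4.0, 0.4, -4.0], [5.0, 0.5, -5.0]])
Y0 = tf.constant([[-10.0], [-5.0], [0.0], [5.0], [10.0]])
W = tf.Variable([[1.0], [1.0], [1.0]])
b1 = tf.Variable(-0.5)
ones = tf.ones([5, 1])

with tf.GradientTape() as tape:
    # the bias multiplication happens inside the tape, so it is recorded
    Y = tf.matmul(X, W) + b1 * ones
    loss_val = tf.reduce_sum(tf.square(Y - Y0))

# one gradient call with a list of sources returns both gradients,
# so persistent=True is not required
gw, gb = tape.gradient(loss_val, [W, b1])
print("gradient W : ", gw.numpy())
print("gradient b : ", gb.numpy())
```

tape.gradient returns the gradients in the same order as the source list, and the tape is released after this single call.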