How tf.gradients works in TensorFlow

Question

Given the linear model below, I want to get the gradient vectors with respect to W and b.

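import numpy as np
import tensorflow as tf

rng = np.random  # assumed: rng is NumPy's random module

# tf Graph Input
X = tf.placeholder("float")
Y = tf.placeholder("float")

# Set model weights
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")

# Construct a linear model (tf.mul was renamed to tf.multiply in TensorFlow 1.0)
pred = tf.add(tf.multiply(X, W), b)

# Mean squared error; n_samples is the number of training examples, defined elsewhere
cost = tf.reduce_sum(tf.pow(pred-Y, 2))/(2*n_samples)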

However, the cost is a function cost(x, y, w, b), and I only want the gradients with respect to w and b. If I try something like this:

grads = tf.gradients(cost, tf.all_variables())

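my placeholders (X and Y) would also be included. Even if I did get gradients for [x, y, w, b], how would I know which element of the result belongs to which parameter? It is just a list without names, with nothing to say which parameter each derivative was taken with respect to.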

In this question I am using parts of this code, and I build on this question.

Answer 1

Quoting the documentation for tf.gradients:

Constructs symbolic partial derivatives of sum of ys w.r.t. x in xs.

So, this should work:

dc_dw, dc_db = tf.gradients(cost, [W, b])

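Here, tf.gradients() returns the gradient of cost with respect to each tensor in the second argument, as a list in the same order. Placeholders such as X and Y are left out simply because they are not in that list.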
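To make the ordering concrete, here is a minimal sketch that evaluates the two gradients and pairs each one with a label. It assumes a TensorFlow installation that provides the tensorflow.compat.v1 module; the toy data, the fixed initial values for W and b, and the names params and named_grads are made up for this illustration and are not part of the original question.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # tf.gradients operates on a graph, not eagerly

# Hypothetical toy data, chosen only for this illustration.
train_X = np.array([1.0, 2.0, 3.0], dtype=np.float32)
train_Y = np.array([2.0, 4.0, 6.0], dtype=np.float32)
n_samples = train_X.shape[0]

X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)

# Fixed initial values (instead of random ones) so the result is reproducible.
W = tf.Variable(0.5, name="weight")
b = tf.Variable(0.0, name="bias")

pred = tf.add(tf.multiply(X, W), b)
cost = tf.reduce_sum(tf.pow(pred - Y, 2)) / (2 * n_samples)

# Only W and b are listed, so no gradients are built for X and Y.
params = [W, b]
grads = tf.gradients(cost, params)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    grad_values = sess.run(grads, feed_dict={X: train_X, Y: train_Y})

# The result list follows the order of params, so zipping attaches labels.
named_grads = dict(zip(["W", "b"], grad_values))
print(named_grads)
```

Because the output order mirrors the input list, keeping the list of variables around and zipping it with the result is usually enough to tell the gradients apart.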

Read the tf.gradients documentation for more information.
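As a sanity check on what dc_dw and dc_db should evaluate to, the cost from the question can be differentiated by hand: with residuals r_i = W*x_i + b - y_i, the derivative with respect to W is sum(r_i * x_i) / n_samples and the derivative with respect to b is sum(r_i) / n_samples. The sketch below uses plain Python rather than TensorFlow and compares those formulas against central finite differences; the function names and sample data are hypothetical.

```python
def cost(xs, ys, w, b):
    """Same cost as in the question: sum((w*x + b - y)^2) / (2 * n_samples)."""
    n = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / (2 * n)


def analytic_grads(xs, ys, w, b):
    """Hand-derived gradients, returned in the order [d cost/d w, d cost/d b]."""
    n = len(xs)
    residuals = [w * x + b - y for x, y in zip(xs, ys)]
    dc_dw = sum(r * x for r, x in zip(residuals, xs)) / n
    dc_db = sum(residuals) / n
    return [dc_dw, dc_db]


def numeric_grads(xs, ys, w, b, eps=1e-6):
    """Central finite differences, in the same [w, b] order, as a cross-check."""
    dc_dw = (cost(xs, ys, w + eps, b) - cost(xs, ys, w - eps, b)) / (2 * eps)
    dc_db = (cost(xs, ys, w, b + eps) - cost(xs, ys, w, b - eps)) / (2 * eps)
    return [dc_dw, dc_db]


# Hypothetical sample data and parameter values.
xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
exact = analytic_grads(xs, ys, 0.5, 0.0)
approx = numeric_grads(xs, ys, 0.5, 0.0)
print(exact, approx)
```

If the values obtained from tf.gradients(cost, [W, b]) for the same inputs differ noticeably from these, the graph or the feed data is the first place to look.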

