Why is TensorFlow unable to compute the gradient with respect to reshaped parameters?

Question

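I want to compute the gradient of the loss with respect to all of the network parameters. The problem arises when I try to reshape each weight matrix into a 1-dimensional tensor (this is useful for computations I do later with the gradients).

At this point TensorFlow outputs a list of None, which means there is no path from the loss to those tensors. There should be one, since they are just the model parameters reshaped.

Here is the code: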

import tensorflow as tf

all_tensors = list()
# encoder tensors: forward and backward LSTM kernels
for direction in ["fw", "bw"]:
    for mtype in ["kernel"]:
        t = tf.get_default_graph().get_tensor_by_name("encoder/bidirectional_rnn/%s/lstm_cell/%s:0" % (direction, mtype))
        all_tensors.append(t)
# classifier tensors:
for mtype in ["kernel", "bias"]:
    t = tf.get_default_graph().get_tensor_by_name("encoder/dense/%s:0" % (mtype))
    all_tensors.append(t)
# flatten every parameter to 1-D (this is the line that leads to the None gradients)
all_tensors = [tf.reshape(x, [-1]) for x in all_tensors]
tf.gradients(self.loss, all_tensors)

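At the end of the for loops, all_tensors is a list of 4 elements, which are matrices of different shapes. This code outputs [None, None, None, None]. If I remove the reshape line all_tensors = [tf.reshape(x, [-1]) for x in all_tensors], the code works fine and returns 4 tensors containing the gradient with respect to each parameter.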

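Why is this happening? I am pretty sure that reshape does not break any dependency in the graph, otherwise it could not be used in any network at all.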

Answer 1

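Well, the fact is that there is no path from your reshaped tensors to the loss. If you think of the computation graph in TensorFlow, self.loss is defined through a series of operations that at some point use the tensors you are interested in. However, when you do: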

all_tensors = [tf.reshape(x, [-1]) for x in all_tensors]

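you are creating new nodes in the graph and new tensors that nothing else uses. Yes, there is a relationship between those tensors and the loss value, since both are derived from the same parameters, but the loss is not computed from the reshaped tensors. From TensorFlow's point of view, the reshaping is an independent computation.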
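To make the "no path" point concrete, here is a toy sketch in plain Python. It is not TensorFlow's implementation; the Node class and the reaches helper are hypothetical names made up for this illustration. It only models the reachability check: a gradient can exist only if the loss was computed, directly or indirectly, from the tensor you differentiate with respect to.

```python
class Node:
    """Hypothetical graph node: a name plus the nodes it was computed from."""

    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = list(inputs)


def reaches(target, source):
    """Return True if `target` was computed (directly or not) from `source`."""
    stack, seen = [target], set()
    while stack:
        node = stack.pop()
        if node is source:
            return True
        if id(node) in seen:
            continue
        seen.add(id(node))
        stack.extend(node.inputs)  # walk backwards through the inputs
    return False


# The loss is built from the original kernel ...
kernel = Node("kernel")
logits = Node("logits", [kernel])
loss = Node("loss", [logits])

# ... while the reshape is a new branch that the loss never consumes.
flat_kernel = Node("reshape", [kernel])

print(reaches(loss, kernel))       # True
print(reaches(loss, flat_kernel))  # False
```

The reshaped node hangs off the kernel as a dead-end branch, so walking backwards from the loss never visits it. That is the situation in which tf.gradients returns None.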

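If you want to do something like that, you would have to do the reshaping first and then compute the loss using the reshaped tensors. Alternatively, you can just compute the gradients with respect to the original tensors and then reshape the result.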
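Both options can be sketched roughly as follows. This is a minimal example and not your model: the placeholder x, the single 3x2 kernel and the sum-of-matmul loss are stand-ins invented for illustration, and it uses tf.compat.v1 on the assumption that you are running a TensorFlow 2.x install that still ships the TF1-style graph API.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # TF1-style graph API, as used in the question

graph = tf.Graph()
with graph.as_default():
    # Hypothetical stand-in for the real model: one kernel and a scalar loss.
    x = tf1.placeholder(tf.float32, shape=[None, 3], name="x")
    kernel = tf1.get_variable("kernel", shape=[3, 2],
                              initializer=tf1.ones_initializer())
    loss = tf.reduce_sum(tf.matmul(x, kernel))

    # The problem: this reshape is a new tensor that the loss never consumes.
    detached_grads = tf1.gradients(loss, [tf.reshape(kernel, [-1])])

    # Option 1: reshape first, then build the loss from the reshaped tensor.
    flat_kernel = tf.reshape(kernel, [-1])
    loss_from_flat = tf.reduce_sum(tf.matmul(x, tf.reshape(flat_kernel, [3, 2])))
    grads_via_flat = tf1.gradients(loss_from_flat, [flat_kernel])

    # Option 2: differentiate w.r.t. the original tensor, then flatten the result.
    flat_grads = [tf.reshape(g, [-1]) for g in tf1.gradients(loss, [kernel])]

    init = tf1.global_variables_initializer()

with tf1.Session(graph=graph) as sess:
    sess.run(init)
    feed = {x: np.ones((4, 3), dtype=np.float32)}
    via_flat_values, flat_grad_values = sess.run([grads_via_flat, flat_grads],
                                                 feed_dict=feed)

print(detached_grads)             # [None]
print(via_flat_values[0].shape)   # (6,)
print(flat_grad_values[0].shape)  # (6,)
```

For an existing model, option 2 is usually the smaller change, because the code that builds the loss stays untouched and only the gradients are reshaped afterwards.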

python

gradient

deep-learning

tensorflow