Compute gradients for each time step of tf.while_loop

Question

Given a TensorFlow tf.while_loop, how can I calculate the gradient of x_out with respect to all weights of the network for each time step?

import tensorflow as tf  # TensorFlow 1.x graph-mode API

network_input = tf.placeholder(tf.float32, [None])
steps = tf.constant(0.0)  # loop counter

weight_0 = tf.Variable(1.0)
layer_1 = network_input * weight_0

def condition(steps, x):
    return steps <= 5

def loop(steps, x_in):
    weight_1 = tf.Variable(1.0)
    x_out = x_in * weight_1
    steps += 1
    return [steps, x_out]

_, x_final = tf.while_loop(
    condition,
    loop,
    [steps, layer_1]
)

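Some notes

In my network the condition is dynamic. Different runs are going to run the while loop a different number of times.
Calling tf.gradients(x, tf.trainable_variables()) crashes with AttributeError: 'WhileContext' object has no attribute 'pred'. It seems the only way to use tf.gradients inside the loop is to compute the gradient with respect to weight_1 and the current value of x_in (the current time step) only, without backpropagating through time.
In each time step, the network outputs a probability distribution over actions. The gradients are then needed to implement policy gradients.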

Answer 1

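You can't ever call tf.gradients inside a tf.while_loop in Tensorflow, based on this and this. I found this out the hard way when I was trying to build conjugate gradient descent entirely inside the Tensorflow graph.

But if I understand your model correctly, you could write your own version of an RNNCell and wrap it in tf.dynamic_rnn. The actual cell implementation will be a little complex, though, since you need to evaluate a condition dynamically at runtime.

For starters, you can take a look at Tensorflow's dynamic_rnn code here.

Alternatively, dynamic graphs have never been Tensorflow's strong suit, so consider using another framework such as PyTorch, or try eager_execution and see if that helps.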
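
To make concrete what "a gradient per time step" means for the toy loop in the question, here is a minimal sketch in plain Python (no TensorFlow calls). It runs the loop as an ordinary while loop, which is how a define-by-run setup would execute it, and carries the derivatives of x_out along as extra loop state using the product rule. The function name per_step_gradients and its parameters are made up for illustration. It also assumes a scalar input and a single weight_1 shared across iterations, which may not match what the question's code actually builds.

```python
def per_step_gradients(network_input, weight_0=1.0, weight_1=1.0, max_steps=5):
    """Hypothetical helper: run the toy loop from the question in plain Python
    and record d(x_out)/d(weight_0) and d(x_out)/d(weight_1) at every step.

    Assumes a scalar input and one weight_1 shared by all iterations.
    """
    # layer_1 = network_input * weight_0
    x = network_input * weight_0
    dx_dw0 = network_input  # derivative of layer_1 w.r.t. weight_0
    dx_dw1 = 0.0            # layer_1 does not depend on weight_1

    steps = 0
    history = []
    while steps <= max_steps:  # same condition as in the question
        # x_out = x_in * weight_1, so by the product rule:
        #   d(x_out)/d(weight_1) = d(x_in)/d(weight_1) * weight_1 + x_in
        #   d(x_out)/d(weight_0) = d(x_in)/d(weight_0) * weight_1
        dx_dw1 = dx_dw1 * weight_1 + x  # uses x_in, i.e. x before the update
        dx_dw0 = dx_dw0 * weight_1
        x = x * weight_1
        steps += 1
        history.append(
            {"step": steps, "x_out": x, "d_weight_0": dx_dw0, "d_weight_1": dx_dw1}
        )
    return history


if __name__ == "__main__":
    for row in per_step_gradients(2.0, weight_0=3.0, weight_1=0.5):
        print(row)
```

Because the derivatives travel forward with the loop state, nothing has to be differentiated inside the loop after the fact, and the loop can stop after any number of iterations. For a real network with many weights this hand-written bookkeeping gets expensive, so treat it as an illustration of the quantities involved rather than a replacement for automatic differentiation.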
