Tensorflow: tf.gradients between different paths of the graph
I'm working on a DDPG implementation, which requires computing the gradient of one network (below: critic) with respect to the output of another network (below: actor). My code already uses queues instead of feed dicts for most things, but I haven't been able to do that for this particular part yet:
import tensorflow as tf
tf.reset_default_graph()
states = tf.placeholder(tf.float32, (None,))
actions = tf.placeholder(tf.float32, (None,))
actor = states * 1
critic = states * 1 + actions
grads_indirect = tf.gradients(critic, actions)
grads_direct = tf.gradients(critic, actor)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    act = sess.run(actor, {states: [1.]})
    print(act)  # -> [1.]
    cri = sess.run(critic, {states: [1.], actions: [2.]})
    print(cri)  # -> [3.]
    grad1 = sess.run(grads_indirect, {states: [1.], actions: act})
    print(grad1)  # -> [[1.]]
    grad2 = sess.run(grads_direct, {states: [1.], actions: [2.]})
    print(grad2)  # -> TypeError: Fetch argument has invalid type 'NoneType'
grad1 here computes the gradients w.r.t. the fed-in actions, which were previously computed by actor. grad2 is supposed to do the same thing, but directly inside the graph, without feeding the actions back in; instead, actor should be evaluated directly. The problem is that grads_direct is None:

print(grads_direct)  # [None]

How can I accomplish this? Is there a dedicated "evaluate this tensor" operation I could use? Thanks!
In your example you are not using actor to compute critic, which is why the gradient is None. You should do this instead:
actor = states * 1
critic = actor + actions # change here
grads_indirect = tf.gradients(critic, actions)
grads_direct = tf.gradients(critic, actor)
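For completeness, here is a minimal end-to-end sketch of the fixed graph (assuming TF 1.x, as in the question). Once critic is built from the actor tensor, a path exists from critic back to actor, and tf.gradients(critic, actor) returns a real tensor instead of [None]:

import tensorflow as tf

tf.reset_default_graph()

states = tf.placeholder(tf.float32, (None,))
actions = tf.placeholder(tf.float32, (None,))

actor = states * 1
critic = actor + actions  # critic now consumes the actor tensor

# Both gradients are defined, since critic is reachable from each input.
grads_indirect = tf.gradients(critic, actions)  # d(critic)/d(actions)
grads_direct = tf.gradients(critic, actor)      # d(critic)/d(actor)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(grads_direct, {states: [1.], actions: [2.]}))
    # -> [array([1.], dtype=float32)]

As background: tf.gradients returns [None] by default whenever ys is not reachable from xs in the graph, and in TF 1.11+ you can pass unconnected_gradients=tf.UnconnectedGradients.ZERO to get zeros instead. In a full DDPG setup, one common pattern (not shown in the question's code) is to feed grads_direct[0] as the grad_ys argument of a second tf.gradients call over the actor's trainable variables to form the policy gradient.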