Select 来自张量流模型的动作权重
Select weight of action from a tensorflow model
我有一个用于强化学习环境的小模型。
我可以输入一个二维状态张量,我得到一个二维动作权重张量。
假设我输入了两个状态,我得到了以下动作权重:
[[0.1, 0.2],
[0.3, 0.4]]
现在我有另一个 2d 张量,它具有我想从中获取权重的动作编号:
[[1],
[0]]
如何使用这个张量来获取动作的权重?
在这个例子中我想得到:
[[0.2],
[0.3]]
与类似,这里对索引的处理略有不同:
a = tf.constant( [[0.1, 0.2], [0.3, 0.4]])
indices = tf.constant([[1],[0]])
# convert to full indices
full_indices = tf.stack([tf.range(indices.shape[0])[...,tf.newaxis], indices], axis=2)
# gather
result = tf.gather_nd(a,full_indices)
with tf.Session() as sess:
print(sess.run(result))
#[[0.2]
#[0.3]]
一个简单的方法是压缩索引的维度,逐元素乘以相应的单热向量,然后扩展维度。
import tensorflow as tf
weights = tf.constant([[0.1, 0.2], [0.3, 0.4]])
indices = tf.constant([[1], [0]])
# Reduce from 2d (2, 1) to 1d (2,)
indices1d = tf.squeeze(indices)
# One-hot vector corresponding to the indices. shape (2, 2)
action_one_hot = tf.one_hot(indices=indices1d, depth=weights.shape[1])
# Element-wise multiplication and sum across axis 1 to pick the weight. Shape (2,)
action_taken_weight = tf.reduce_sum(action_one_hot * weights, axis=1)
# Expand the dimension back to have a 2d. Shape (2, 1)
action_taken_weight2d = tf.expand_dims(action_taken_weight, axis=1)
sess = tf.InteractiveSession()
print("weights\n", sess.run(weights))
print("indices\n", sess.run(indices))
print("indices1d\n", sess.run(indices1d))
print("action_one_hot\n", sess.run(action_one_hot))
print("action_taken_weight\n", sess.run(action_taken_weight))
print("action_taken_weight2d\n", sess.run(action_taken_weight2d))
应该给你以下输出:
weights
[[0.1 0.2]
[0.3 0.4]]
indices
[[1]
[0]]
indices1d
[1 0]
action_one_hot
[[0. 1.]
[1. 0.]]
action_taken_weight
[0.2 0.3]
action_taken_weight2d
[[0.2]
[0.3]]
Note: You can also do action_taken_weight = tf.reshape(action_taken_weight, tf.shape(indices))
instead of expand_dims.
我有一个用于强化学习环境的小模型。
我可以输入一个二维状态张量,我得到一个二维动作权重张量。
假设我输入了两个状态,我得到了以下动作权重:
[[0.1, 0.2],
[0.3, 0.4]]
现在我有另一个 2d 张量,它具有我想从中获取权重的动作编号:
[[1],
[0]]
如何使用这个张量来获取动作的权重?
在这个例子中我想得到:
[[0.2],
[0.3]]
与
a = tf.constant( [[0.1, 0.2], [0.3, 0.4]])
indices = tf.constant([[1],[0]])
# convert to full indices
full_indices = tf.stack([tf.range(indices.shape[0])[...,tf.newaxis], indices], axis=2)
# gather
result = tf.gather_nd(a,full_indices)
with tf.Session() as sess:
print(sess.run(result))
#[[0.2]
#[0.3]]
一个简单的方法是压缩索引的维度,逐元素乘以相应的单热向量,然后扩展维度。
import tensorflow as tf
weights = tf.constant([[0.1, 0.2], [0.3, 0.4]])
indices = tf.constant([[1], [0]])
# Reduce from 2d (2, 1) to 1d (2,)
indices1d = tf.squeeze(indices)
# One-hot vector corresponding to the indices. shape (2, 2)
action_one_hot = tf.one_hot(indices=indices1d, depth=weights.shape[1])
# Element-wise multiplication and sum across axis 1 to pick the weight. Shape (2,)
action_taken_weight = tf.reduce_sum(action_one_hot * weights, axis=1)
# Expand the dimension back to have a 2d. Shape (2, 1)
action_taken_weight2d = tf.expand_dims(action_taken_weight, axis=1)
sess = tf.InteractiveSession()
print("weights\n", sess.run(weights))
print("indices\n", sess.run(indices))
print("indices1d\n", sess.run(indices1d))
print("action_one_hot\n", sess.run(action_one_hot))
print("action_taken_weight\n", sess.run(action_taken_weight))
print("action_taken_weight2d\n", sess.run(action_taken_weight2d))
应该给你以下输出:
weights
[[0.1 0.2]
[0.3 0.4]]
indices
[[1]
[0]]
indices1d
[1 0]
action_one_hot
[[0. 1.]
[1. 0.]]
action_taken_weight
[0.2 0.3]
action_taken_weight2d
[[0.2]
[0.3]]
Note: You can also do action_taken_weight =
tf.reshape(action_taken_weight, tf.shape(indices))
instead of expand_dims.