Overriding device scope in TensorFlow

Question

How are device scopes handled in the following case, where the outer device scope is overridden by inner device scopes:

with tf.device("/cpu:0"):
    a = function1()

    with tf.device("/gpu:0"):
        b = function2()

    with tf.device("/gpu:1"):
        c = function3()

    d = a+b+c

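My intuition is as follows:

1) "a" is computed first, on "cpu:0".

2) "b" and "c" are computed in parallel, on "gpu:0" and "gpu:1" respectively.

3) "d" waits for "b" and "c" because it depends on them; once their values are available, "d" is computed on "cpu:0".

Is my intuition correct?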

Answer 1

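Mostly, with a few subtleties:

(a) "b" and "c" can be computed in parallel, provided there are no control-flow or data dependencies between them. However, this example does not let you predict whether they will actually execute at the same time. (I think this is already obvious, but I want to make it clear for others who may read this later.)

Note also that, as written, b and c do not explicitly depend on a, so all three may execute at the same time. a does not necessarily run first.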
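The dependency argument in (a) can be sketched in plain Python, without TensorFlow. This is only an illustration: the `deps` table and the `ready_waves` helper are hypothetical, they assume function1, function2 and function3 share no inputs, and they treat `a+b+c` as a single op. The sketch shows which ops are eligible to run together, not what the runtime will actually schedule.

```python
# Hypothetical dependency table for the graph in the question:
# each op maps to the ops whose outputs it consumes.
deps = {
    "a": [],
    "b": [],
    "c": [],
    "d": ["a", "b", "c"],
}


def ready_waves(deps):
    """Group ops into waves; every op in a wave has all its inputs done."""
    done = set()
    waves = []
    while len(done) < len(deps):
        wave = sorted(
            op for op, inputs in deps.items()
            if op not in done and all(i in done for i in inputs)
        )
        if not wave:
            raise ValueError("dependency cycle")
        waves.append(wave)
        done.update(wave)
    return waves


waves = ready_waves(deps)
print(waves)  # [['a', 'b', 'c'], ['d']]
```

a, b and c land in the same wave because none of them consumes another's output, which is why a is not guaranteed to run first.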

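(b) By default, if you do not provide any configuration options, device placement is "soft": the runtime can override the requested device if an operation cannot run there. For example, a CPU-only op can be moved from the GPU back to /cpu:0, or an op pinned to /gpu:1 can be moved to /gpu:0 if the graph runs on a machine with only one GPU.

You can control hard versus soft placement by passing a configuration to tf.Session: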

with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
    pass  # run the graph here, e.g. sess.run(d)
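As a fuller sketch of that session configuration, the snippet below builds the graph from the question with constants standing in for function1, function2 and function3 (an assumption made only so the example runs). It uses the `tf.compat.v1` names so that it also runs under TensorFlow 2.x. Because the default placement behavior may differ between versions, `allow_soft_placement` is set explicitly, and `log_device_placement=True` asks the runtime to log where each op actually ran.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # TF 1.x-style Session API

graph = tf.Graph()
with graph.as_default():
    with tf.device("/cpu:0"):
        a = tf.constant(1, name="a")      # stand-in for function1()
        with tf.device("/gpu:0"):
            b = tf.constant(2, name="b")  # stand-in for function2()
        with tf.device("/gpu:1"):
            c = tf.constant(3, name="c")  # stand-in for function3()
        d = a + b + c

config = tf1.ConfigProto(
    allow_soft_placement=True,   # fall back to another device if needed
    log_device_placement=True,   # log the device chosen for each op
)
with tf1.Session(graph=graph, config=config) as sess:
    result = sess.run(d)

print(result)
```

With soft placement enabled, this should also run on a machine that has fewer than two GPUs; the placement log then shows which ops were moved.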

Answer 2

Yes.

PS: to test your intuition, you can do this:

import tensorflow as tf

with tf.device("/cpu:0"):
  a = tf.placeholder(dtype=tf.int32, name="a")
  with tf.device("/gpu:0"):
    b = tf.placeholder(dtype=tf.int32, name="b")
    with tf.device("/gpu:1"):
      c = tf.placeholder(dtype=tf.int32, name="c")
      d = a+b+c

# Print the graph definition, including the device recorded for each node.
print(d.graph.as_graph_def())

This prints the underlying graph definition that the TensorFlow system will run.
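The full graph definition can be long, so a shorter way to read it is to list only each node's name and requested device. The following is a minimal sketch of that idea, assuming the `tf.compat.v1` placeholder API and an explicit `tf.Graph`; the `placements` dictionary is a name introduced here for illustration.

```python
import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    with tf.device("/cpu:0"):
        a = tf.compat.v1.placeholder(dtype=tf.int32, name="a")
        with tf.device("/gpu:0"):
            b = tf.compat.v1.placeholder(dtype=tf.int32, name="b")
            with tf.device("/gpu:1"):
                c = tf.compat.v1.placeholder(dtype=tf.int32, name="c")
                d = a + b + c

# Map each node name to the device string recorded in the GraphDef.
placements = {node.name: node.device for node in graph.as_graph_def().node}
for name, device in placements.items():
    print(name, device)
```

These are the requested devices recorded at graph construction time, before any soft placement is applied. Note that in this snippet d is created inside the innermost scope, so it is recorded on GPU 1, whereas in the question d sits directly under the "/cpu:0" scope.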
