Is it possible to split a network across multiple GPUs in TensorFlow?

Question

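I plan to run a very large recurrent network (e.g. 2048x5). Is it possible to define one layer per GPU in TensorFlow? How should I implement the model to achieve the best efficiency? I understand that inter-GPU or GPU-CPU-GPU communication has overhead.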

Answer 1

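Splitting a large model across multiple GPUs is certainly possible in TensorFlow, but doing it optimally is a hard research problem. In general, you will need to do the following: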

Wrap large contiguous regions of your code in a with tf.device(...): block, naming the different GPUs:

```
import tensorflow as tf  # TensorFlow 1.x API

with tf.device("/gpu:0"):
  # Define first layer.
  pass

with tf.device("/gpu:1"):
  # Define second layer.
  pass

# Define other layers, etc.
```
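For a deeper stack, such as the five layers in the question, it can help to decide up front which device each layer goes on, so that every GPU gets one contiguous run of layers and activations cross a device boundary as few times as possible. The following is a minimal sketch in plain Python; the helper name `assign_layers_to_gpus` is hypothetical (it is not part of TensorFlow) and it only builds the device strings you would pass to `tf.device(...)`.

```python
def assign_layers_to_gpus(num_layers, num_gpus):
    """Return one device string per layer, keeping each GPU's layers contiguous.

    Hypothetical helper for illustration; it does not touch TensorFlow.
    """
    if num_layers < 1 or num_gpus < 1:
        raise ValueError("num_layers and num_gpus must both be >= 1")
    # Never use more GPUs than there are layers.
    used = min(num_layers, num_gpus)
    # Layer i goes to GPU floor(i * used / num_layers); the index never
    # decreases as i grows, so each GPU holds a single contiguous block.
    return ["/gpu:%d" % (i * used // num_layers) for i in range(num_layers)]


devices = assign_layers_to_gpus(num_layers=5, num_gpus=2)
print(devices)
# ['/gpu:0', '/gpu:0', '/gpu:0', '/gpu:1', '/gpu:1']
```

Each layer `i` would then be defined inside `with tf.device(devices[i]):`. An even split by layer count is only a starting point; layers with very different sizes may need a different split.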

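When building your optimizer, pass the optional argument colocate_gradients_with_ops=True to the optimizer.minimize() method, so that each gradient op is placed on the same device as the forward op it corresponds to: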

```
loss = ...
optimizer = tf.train.AdagradOptimizer(0.01)
# Place each gradient op on the same device as its forward op.
train_op = optimizer.minimize(loss, colocate_gradients_with_ops=True)
```

(Optional.) You may need to enable "soft placement" in the tf.ConfigProto when you create your tf.Session, if any of the operations in your model cannot run on a GPU:
```
config = tf.ConfigProto(allow_soft_placement=True)
sess = tf.Session(config=config)
```
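
Putting the three steps together, a toy end-to-end version might look like the sketch below. Everything specific in it is an assumption made for illustration: the two small dense layers stand in for the recurrent layers, the data is random, `dense` and `train_two_device_model` are hypothetical helpers, and the code assumes TensorFlow 2.x, where the 1.x graph-mode API is reached through `tf.compat.v1` (under TensorFlow 1.x the same calls live directly on `tf`). Because soft placement is enabled, it should also run on a machine with fewer than two GPUs, with the ops falling back to whatever device is available.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # TF 1.x graph-mode API as shipped inside TensorFlow 2.x


def dense(x, units, name):
    """Hypothetical helper: a fully connected layer built from raw variables."""
    in_dim = int(x.shape[-1])
    with tf1.variable_scope(name):
        w = tf1.get_variable("w", [in_dim, units],
                             initializer=tf1.glorot_uniform_initializer(seed=0))
        b = tf1.get_variable("b", [units], initializer=tf1.zeros_initializer())
    return tf.matmul(x, w) + b


def train_two_device_model(steps=20):
    """Build a toy two-layer model split across two devices and train briefly."""
    rng = np.random.RandomState(0)
    data_x = rng.randn(32, 8).astype(np.float32)  # random stand-in inputs
    data_y = rng.randn(32, 1).astype(np.float32)  # random stand-in targets

    graph = tf.Graph()
    with graph.as_default():
        x = tf1.placeholder(tf.float32, [None, 8])
        y = tf1.placeholder(tf.float32, [None, 1])

        # Step 1: one contiguous region of the model per device.
        with tf.device("/gpu:0"):
            hidden = tf.nn.relu(dense(x, 16, "layer1"))
        with tf.device("/gpu:1"):
            pred = dense(hidden, 1, "layer2")
            loss = tf.reduce_mean(tf.square(pred - y))

        # Step 2: keep gradient ops on the same device as their forward ops.
        optimizer = tf1.train.AdagradOptimizer(0.01)
        train_op = optimizer.minimize(loss, colocate_gradients_with_ops=True)
        init = tf1.global_variables_initializer()

    # Step 3: fall back to another device when an op cannot run where requested.
    config = tf1.ConfigProto(allow_soft_placement=True)
    with tf1.Session(graph=graph, config=config) as sess:
        sess.run(init)
        feed = {x: data_x, y: data_y}
        initial_loss = sess.run(loss, feed)
        for _ in range(steps):
            sess.run(train_op, feed)
        final_loss = sess.run(loss, feed)
    return float(initial_loss), float(final_loss)


initial_loss, final_loss = train_two_device_model()
print("loss before: %.4f, after: %.4f" % (initial_loss, final_loss))
```

Adding `log_device_placement=True` to the `ConfigProto` makes TensorFlow log where each op actually ran, which is a quick way to check that the split took effect.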
