Train on multiple devices

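I know that TensorFlow offers a distributed training API that can train on multiple devices, such as multiple GPUs, CPUs, TPUs, or multiple machines (workers), as described in this doc: https://www.tensorflow.org/tutorials/distribute/multi_worker_with_keras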

But I have a question: is there any way to use data parallelism to split training across multiple machines, including both mobile devices and computers?

If you have any tutorial or instructions, I would really appreciate it.

As far as I know, TensorFlow only supports CPUs, TPUs, and GPUs for distributed training, and all devices should be on the same network.

To connect multiple devices, as you mentioned, you can follow Multi-worker training.
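As a rough illustration of the multi-worker setup, each machine describes the cluster through the `TF_CONFIG` environment variable before the strategy is created. The sketch below only builds that variable. The addresses in `WORKERS` and the helper `make_tf_config` are hypothetical names made up for this example, so replace them with your own hosts and ports.

```python
import json
import os

# Hypothetical addresses (assumption): use the real host:port of each machine.
WORKERS = ["192.168.0.10:12345", "192.168.0.11:12345"]


def make_tf_config(workers, index):
    """Return the TF_CONFIG JSON string for the worker at position `index`."""
    if not 0 <= index < len(workers):
        raise ValueError(f"index {index} is outside the worker list")
    return json.dumps({
        # Every worker gets the same cluster description.
        "cluster": {"worker": list(workers)},
        # Only the task index differs from machine to machine.
        "task": {"type": "worker", "index": index},
    })


# On the first machine use index=0, on the second index=1, and so on.
# This has to be set before TensorFlow creates the strategy.
os.environ["TF_CONFIG"] = make_tf_config(WORKERS, index=0)
```

Every machine runs the same training script. Only the `index` value changes between them.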

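tf.distribute.Strategy is integrated into tf.keras, so when you use model.fit with a tf.distribute.Strategy instance and build your model inside strategy.scope(), distributed variables are created. This lets it divide your input data equally across your devices. You can follow this tutorial for more details.
Distributed input may also help you.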
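Here is a minimal sketch of that pattern, assuming a toy model and random data that I made up for illustration. It uses `tf.distribute.MirroredStrategy` so it can run on a single machine. For several machines you would presumably swap in `tf.distribute.MultiWorkerMirroredStrategy` with `TF_CONFIG` set on each worker, as the multi-worker tutorial describes.

```python
import numpy as np
import tensorflow as tf

# Single-machine stand-in (assumption); falls back to one replica on CPU.
strategy = tf.distribute.MirroredStrategy()
print("replicas in sync:", strategy.num_replicas_in_sync)

# Variables created inside the scope become distributed variables.
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="sgd", loss="mse")

# Toy data (assumption). The global batch is split across the replicas.
global_batch = 8 * strategy.num_replicas_in_sync
x = np.random.rand(64, 4).astype("float32")
y = np.random.rand(64, 1).astype("float32")
dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(global_batch)

# model.fit takes care of distributing the batches to the replicas.
history = model.fit(dataset, epochs=1, verbose=0)
```

Scaling the batch size by `num_replicas_in_sync` keeps the per-replica batch constant as you add devices.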


machine-learning

tensorflow

distributed-training