What's the impact of using a GPU on the performance of serving a TensorFlow model?

I trained a neural network using a GPU (1080 Ti). Training on the GPU was far faster than on the CPU.

Now I want to serve this model with TensorFlow Serving. I'm wondering whether using a GPU during serving has the same effect on performance.

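Since training works on batches but inference (serving) handles asynchronous requests, would you recommend using a GPU when serving a model with TensorFlow Serving?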

The short answer is yes: you'll get roughly the same speedup from running on the GPU after training. There are a few minor qualifications.

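Training makes two passes over the data (forward and backward), and all of that happens on the GPU. Feedforward inference does less work, so a larger share of the time goes to transferring data into GPU memory relative to computation than it does in training. This is probably a minor difference, though. If it does become a problem, you can now load the GPU asynchronously (https://github.com/tensorflow/tensorflow/issues/7679).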

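Whether you actually need a GPU for inference depends on your workload. If the workload isn't very demanding, you may get away with using the CPU anyway. After all, the computation per sample is less than half of what it is in training. So consider how many requests per second you need to serve, and test whether reaching that target overloads your CPU. If it does, time to get the GPU out!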
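One rough way to run that test is to time repeated calls to whatever function sends a request to the model and compare the result with the target rate. The sketch below is only an illustration: `measure_throughput`, `meets_target`, and `dummy_predict` are hypothetical names, and `dummy_predict` is a stand-in for a real model call (for example, a client request to your serving endpoint). It measures a single sequential client, so it says nothing about behaviour under concurrent load.

```python
import statistics
import time


def measure_throughput(predict_fn, sample, num_requests=200, warmup=10):
    """Call predict_fn(sample) sequentially and report latency and throughput."""
    # Warm-up calls are excluded from timing (caches, lazy initialisation).
    for _ in range(warmup):
        predict_fn(sample)

    latencies = []
    start = time.perf_counter()
    for _ in range(num_requests):
        t0 = time.perf_counter()
        predict_fn(sample)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start

    ordered = sorted(latencies)
    return {
        "requests_per_second": num_requests / elapsed,
        "mean_latency_ms": 1000 * statistics.mean(latencies),
        # Simple index-based estimate of the 95th percentile latency.
        "p95_latency_ms": 1000 * ordered[int(0.95 * (num_requests - 1))],
    }


def meets_target(stats, target_rps):
    """Return True if the measured rate reaches the required requests/second."""
    return stats["requests_per_second"] >= target_rps


def dummy_predict(sample):
    # Hypothetical stand-in for a real inference call.
    return sum(x * x for x in sample)


if __name__ == "__main__":
    stats = measure_throughput(dummy_predict, list(range(1000)))
    print(stats)
    print("meets 100 req/s target:", meets_target(stats, 100))
```

Running the same measurement against a CPU-only deployment and a GPU deployment, with your real request function in place of `dummy_predict`, gives a direct comparison for your own workload.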

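You still need to run a lot of tensor operations on the graph to predict anything, so the GPU still provides a performance improvement for inference. Take a look at this NVIDIA paper. They did not test with TF, but it is still relevant: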

Our results show that GPUs provide state-of-the-art inference performance and energy efficiency, making them the platform of choice for anyone wanting to deploy a trained neural network in the field. In particular, the Titan X delivers between 5.3 and 6.7 times higher performance than the 16-core Xeon E5 CPU while achieving 3.6 to 4.4 times higher energy efficiency.

tensorflow

tensorflow-serving

tensorflow-gpu