How large can a TensorFlow .record file be before it causes performance issues?

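The TensorFlow Object Detection API advocates sharding when the dataset contains "more than a few thousand examples", noting that:

tf.data.Dataset API can read input examples in parallel, improving throughput.
tf.data.Dataset API can shuffle the examples better with sharded files, which slightly improves the performance of the model.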
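For context, here is a minimal sketch of how those two benefits are usually obtained when reading sharded files with tf.data. The function name `make_dataset`, the file pattern, and the default values for `cycle_length` and `shuffle_buffer` are assumptions chosen for illustration, not values taken from the Object Detection API.

```python
import tensorflow as tf


def make_dataset(file_pattern, cycle_length=4, shuffle_buffer=1000):
    """Read sharded TFRecord files in parallel and shuffle the records.

    All parameter defaults are illustrative assumptions; tune them for
    your own data and hardware.
    """
    # Shuffle the shard filenames so each pass visits shards in a new order.
    files = tf.data.Dataset.list_files(file_pattern, shuffle=True)

    # Open several shards at once and interleave their records.
    # This is the step that needs more than one file to help.
    dataset = files.interleave(
        tf.data.TFRecordDataset,
        cycle_length=cycle_length,
        num_parallel_calls=tf.data.AUTOTUNE,
    )

    # Shuffle records within a bounded in-memory buffer.
    return dataset.shuffle(shuffle_buffer)
```

With a single large .record file, `list_files` yields one filename, so neither the filename shuffle nor the parallel interleave has anything to work with; only the bounded buffer shuffle remains.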

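"A few thousand" is a bit vague, and a more precise answer, such as a file size, would be helpful. In other words, how large can a .record file get before it starts to cause performance issues? When sharding the data, what file size should we aim for?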

The TensorFlow team seems to recommend shards of ~100MB. https://www.tensorflow.org/guide/performance/overview You might also consider the performance implications related to batch size while training. https://www.pugetsystems.com/labs/hpc/GPU-Memory-Size-and-Deep-Learning-Performance-batch-size-12GB-vs-32GB----1080Ti-vs-Titan-V-vs-GV100-1146/
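As a rough illustration of turning the ~100MB guideline into a shard count, here is a small sketch in plain Python. The helper names (`plan_shards`, `shard_filenames`, `shard_for_example`), the `-NNNNN-of-NNNNN` filename suffix, and the round-robin assignment are assumptions made for this example; the 100MB target is a guideline, not a hard limit.

```python
import math

# Assumption: target shard size of roughly 100 MB, per the guideline above.
TARGET_SHARD_BYTES = 100 * 1024 * 1024


def plan_shards(total_bytes, target_shard_bytes=TARGET_SHARD_BYTES):
    """Return how many shards keep each file near the target size."""
    return max(1, math.ceil(total_bytes / target_shard_bytes))


def shard_filenames(base, num_shards):
    """Build shard names such as 'train.record-00000-of-00003'."""
    return ["%s-%05d-of-%05d" % (base, i, num_shards) for i in range(num_shards)]


def shard_for_example(index, num_shards):
    """Assign examples to shards round-robin so shard sizes stay balanced."""
    return index % num_shards


if __name__ == "__main__":
    total = 5 * 1024 ** 3  # hypothetical dataset of 5 GiB
    n = plan_shards(total)
    print(n, shard_filenames("train.record", n)[:2])
```

Round-robin assignment keeps the shards close to equal in size when examples are similar in size; if example sizes vary widely, tracking bytes written per shard would give a tighter bound.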