What does BUFFER_SIZE do in TensorFlow Dataset shuffling?

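So I've been working through this code: https://www.tensorflow.org/tutorials/generative/dcgan and I think I have a pretty good understanding of what it does. However, I can't figure out what the BUFFER_SIZE variable is for. I suspect it might be used to create a subset of the dataset of size BUFFER_SIZE and then take batches from that subset, but I don't see the point of that, and I can't find anyone who explains it.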

So if someone could explain to me what BUFFER_SIZE does, I'd be very grateful ❤

It's used as the buffer_size argument of tf.data.Dataset.shuffle. Have you read the docs?

This dataset fills a buffer with buffer_size elements, then randomly samples elements from this buffer, replacing the selected elements with new elements. For perfect shuffling, a buffer size greater than or equal to the full size of the dataset is required.

For instance, if your dataset contains 10,000 elements but buffer_size is set to 1,000, then shuffle will initially select a random element from only the first 1,000 elements in the buffer. Once an element is selected, its space in the buffer is replaced by the next (i.e. 1,001-st) element, maintaining the 1,000 element buffer.
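The behaviour described in the quoted docs can be sketched in plain Python. This is only a rough simulation: buffered_shuffle and its seed parameter are hypothetical names made up for illustration, not part of TensorFlow, and the real implementation may differ in detail.

```python
import random


def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in an order a shuffle buffer of this size could produce.

    Hypothetical helper for illustration only; not a TensorFlow API.
    """
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)
    source = iter(items)
    buffer = []
    # Fill the buffer with the first buffer_size elements.
    for item in source:
        buffer.append(item)
        if len(buffer) == buffer_size:
            break
    # Emit a random buffered element, then put the next source
    # element into the slot that was just freed.
    for item in source:
        i = rng.randrange(len(buffer))
        yield buffer[i]
        buffer[i] = item
    # Source exhausted: emit whatever is left in random order.
    rng.shuffle(buffer)
    yield from buffer


# Same numbers as the docs example: 10,000 elements, buffer of 1,000.
out = list(buffered_shuffle(range(10_000), buffer_size=1_000, seed=0))
first = out[0]  # always drawn from the first 1,000 elements (0..999)
```

Under this model, the element emitted at output position k can never come from further along than index k + buffer_size - 1, so a small buffer only shuffles locally. With buffer_size=1 the order is unchanged, and with buffer_size at least the dataset size every ordering is possible, which is why the tutorial sets BUFFER_SIZE to the full dataset size.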

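According to the TensorFlow documentation, buffer_size determines the range from which the first element is drawn: it is picked at random from the first buffer_size elements of the dataset. Once that element has been picked, the next element of the dataset takes its place in the buffer, and every later pick is again made at random from the buffer_size elements currently in the buffer.

samples = 1000
buffer_size = 100

The first element is picked at random from indices 0 to 99
random = 37
Element 100 then takes the slot of element 37, and the next pick is made at random from the 100 elements now in the buffer (0 to 100, excluding 37)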