tf.estimator shuffle - random seed?

Question

When I run tf.estimator.LinearRegressor repeatedly, the results are slightly different each time. I'm guessing this is because of the shuffle=True here:

input_fn = tf.estimator.inputs.numpy_input_fn(
    {"x": x_train}, y_train, batch_size=4, num_epochs=None, shuffle=True)

That's fine as far as it goes, but when I try to make it deterministic by seeding the random number generators in both np and tf:

np.random.seed(1)
tf.set_random_seed(1)

The results are still slightly different each time. What am I missing?

Answer 1

tf.set_random_seed sets the graph-level seed, but that is not the only source of randomness: there is also an operation-level seed, which has to be provided for each op.

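Unfortunately, tf.estimator.inputs.numpy_input_fn does not expose a seed argument alongside shuffle that it could pass down to the underlying ops (source code). As a result, the _enqueue_data function always gets seed=None, which resets any seed you set in advance. Incidentally, it is worth noting that many of the underlying feed functions shuffle with the standard Python random module (random.seed) rather than TensorFlow's random ops (see _ArrayFeedFn, _OrderedDictNumpyFeedFn, etc.).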

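Summary: there is currently no way to guarantee reproducible execution with shuffle=True, at least with the current API. Your only option is to shuffle the data yourself and pass shuffle=False.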
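The "shuffle it yourself" workaround could look roughly like the sketch below. It assumes TensorFlow 1.x (where tf.estimator.inputs.numpy_input_fn exists) and that x_train and y_train are NumPy arrays of equal length. The helper names shuffle_in_unison and make_input_fn are made up for illustration and are not part of any library.

```python
import numpy as np


def shuffle_in_unison(x, y, seed):
    """Return copies of x and y reordered by the same seeded permutation."""
    x = np.asarray(x)
    y = np.asarray(y)
    # A private RandomState keeps this independent of the global np.random state.
    rng = np.random.RandomState(seed)
    perm = rng.permutation(len(x))
    return x[perm], y[perm]


def make_input_fn(x_train, y_train, seed=1):
    """Hypothetical helper: build an input_fn over pre-shuffled data."""
    # Imported lazily so the shuffling helper can be used without TensorFlow.
    import tensorflow as tf  # assumption: TensorFlow 1.x

    x_shuf, y_shuf = shuffle_in_unison(x_train, y_train, seed)
    # shuffle=False: the order is fixed by the seeded permutation above.
    return tf.estimator.inputs.numpy_input_fn(
        {"x": x_shuf}, y_shuf, batch_size=4, num_epochs=None, shuffle=False)
```

Note that with num_epochs=None this repeats the same fixed order every epoch, so it is not equivalent to a fresh shuffle per epoch. It also only pins down the data order; any other source of randomness in the model is outside the scope of this sketch.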

python

random

deep-learning

tensorflow

tensorflow-estimator