Which seeds have to be set where to realize 100% reproducibility of training results in tensorflow?

Question

In a general TensorFlow setup like this:

model = construct_model()
with tf.Session() as sess:
    train_model(sess)

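Here construct_model() contains the model definition, including random initialization of the weights (tf.truncated_normal), and train_model(sess) runs the training of the model.

Which seeds do I have to set, and where, to ensure 100% reproducibility between repeated runs of the code snippet above? The documentation for tf.random.set_random_seed may be concise, but it left me a bit confused. I tried: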

tf.set_random_seed(1234)
model = construct_model()
with tf.Session() as sess:
    train_model(sess)

But I got different results each time.

Answer 1

One possible cause is that some code uses the numpy.random module while the model is being constructed. So you could also try setting the seed for numpy.
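To illustrate the numpy point, here is a minimal sketch. The function construct_weights is a hypothetical stand-in for whatever part of construct_model() draws from numpy.random; it is not taken from the question's code. It shows that reseeding numpy before each construction gives identical draws.

```python
import numpy as np


def construct_weights(shape, seed=None):
    """Hypothetical stand-in for model code that draws initial values with numpy."""
    if seed is not None:
        # Reset numpy's global generator so the draws below are repeatable.
        np.random.seed(seed)
    return np.random.standard_normal(shape)


first = construct_weights((2, 3), seed=1234)
second = construct_weights((2, 3), seed=1234)

# Same seed before each call, so both arrays hold the same values.
print(np.array_equal(first, second))
```

Without the seed argument, each call continues from wherever numpy's global generator happens to be, so two constructions would generally differ.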

Answer 2

The best solution that currently works with GPUs is to install tensorflow-determinism as follows:

pip install tensorflow-determinism

Then include the following in your code:

import tensorflow as tf
import os
os.environ['TF_DETERMINISTIC_OPS'] = '1'

Source: https://github.com/NVIDIA/tensorflow-determinism

Answer 3

What worked for me was following this answer with a few modifications:

import os
import random

import numpy as np
import tensorflow as tf

# Setting seed value
# from 
# generated randomly by running `random.randint(0, 100)` once
SEED = 75
# 1. Set the `PYTHONHASHSEED` environment variable at a fixed value
os.environ['PYTHONHASHSEED'] = str(SEED)
# 2. Set the `python` built-in pseudo-random generator at a fixed value
random.seed(SEED)
# 3. Set the `numpy` pseudo-random generator at a fixed value
np.random.seed(SEED)
# 4. Set the `tensorflow` pseudo-random generator at a fixed value
tf.random.set_seed(SEED)

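I don't know how to set up step 5, but it does not seem to be necessary.

I am running Google Colab Pro on a high-RAM TPU, and with this method my training results (the plot of the loss function) were exactly the same three times in a row.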
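The four steps can also be collected into one helper that is called once at the top of a script. This is only a sketch: set_global_seeds is a hypothetical name, and the optional imports are an assumption so that the helper also runs where numpy or tensorflow is not installed. Note that setting PYTHONHASHSEED from inside a running interpreter only affects child processes, because hash randomization for the current process is fixed at startup.

```python
import os
import random


def set_global_seeds(seed):
    """Seed every generator that is available (hypothetical helper)."""
    # 1. Inherited by child processes; the current interpreter's hash
    #    randomization was already fixed when it started.
    os.environ["PYTHONHASHSEED"] = str(seed)

    # 2. Python's built-in pseudo-random generator.
    random.seed(seed)

    # 3. numpy's global pseudo-random generator, if numpy is installed.
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass

    # 4. tensorflow's global seed, if tensorflow is installed.
    try:
        import tensorflow as tf
    except ImportError:
        return
    tf_random = getattr(tf, "random", None)
    if hasattr(tf_random, "set_seed"):
        tf_random.set_seed(seed)      # TensorFlow 2.x name
    else:
        tf.set_random_seed(seed)      # TensorFlow 1.x name


set_global_seeds(75)
first = [random.random() for _ in range(3)]

set_global_seeds(75)
second = [random.random() for _ in range(3)]

# Reseeding with the same value replays the same sequence.
print(first == second)
```

Calling the helper before the model is constructed matters, since any random draw made earlier has already consumed generator state.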

python

random-seed

tensorflow