How do I use hparams with estimators?

To log hparams without using Keras, I'm doing the following, as suggested in the tf code here:

import tensorflow as tf
from tensorboard.plugins.hparams import api as hp

with tf.summary.create_file_writer(model_dir).as_default():
    hp_learning_rate = hp.HParam("learning_rate", hp.RealInterval(0.00001, 0.1))
    hp_distance_margin = hp.HParam("distance_margin", hp.RealInterval(0.1, 1.0))
    hparams_list = [
        hp_learning_rate,
        hp_distance_margin
    ]
    metrics_to_monitor = [
        hp.Metric("metrics_standalone/auc", group="validation"),
        hp.Metric("loss", group="train", display_name="training loss"),
    ]
    hp.hparams_config(hparams=hparams_list, metrics=metrics_to_monitor)
    hparams = {
        hp_learning_rate: params.learning_rate,
        hp_distance_margin: params.distance_margin,
    }
    hp.hparams(hparams)

Note that params is a dictionary object that I pass to the estimator.

Then I train the estimator as usual:

config = tf.estimator.RunConfig(model_dir=params.model_dir)
estimator = tf.estimator.Estimator(model_fn, params=params, config=config)
train_spec = tf.estimator.TrainSpec(...)
eval_spec = tf.estimator.EvalSpec(...)

tf.estimator.train_and_evaluate(estimator, train_spec=train_spec, eval_spec=eval_spec)

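After training, when I launch tensorboard, I do see the hparams logged, but I don't see any metrics logged against them.

I further confirmed that the metrics do show up on the scalars page, with the same tag name for both training and validation, i.e. ../eval, but the hparams page doesn't pick up those logged tensors.

So, how do I use hparams with estimators?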


I'm using

tensorboard              2.1.0
tensorflow               2.1.0
tensorflow-estimator     2.1.0
tensorflow-metadata      0.15.2

Python 3.7.5


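Attempt 1:

After some googling, I saw some older tf code that passed hparams to the params argument of Estimator. To check whether tf2 logs those hparams by itself when they are given, I looked at the Estimator docs, which say: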

The params argument contains hyperparameters. It is passed to the model_fn, if the model_fn has a parameter named "params", and to the input functions in the same manner. Estimator only passes params along, it does not inspect it. The structure of params is therefore entirely up to the developer.

So passing hparams as params will not help.


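Attempt 2:

I suspected that, since estimators use tensorflow.python.summary instead of tf.summary (the default in v2), the tensors logged by v1 were probably not accessible, so I also tried using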

with tensorflow.python.summary.FileWriter(model_dir).as_default()

But that failed with RuntimeError: tf.summary.FileWriter is not compatible with eager execution. Use tf.contrib.summary instead

Update: I ran it with eager execution disabled. Now even the initial hparams logging does not happen. There is no hparams tab in tensorboard, because it fails with the error
E0129 13:03:07.656290 21584 hparams_plugin.py:104] HParams error: Can't find an HParams-plugin experiment data in the log directory. Note that it takes some time to scan the log directory; if you just started Tensorboard it could be that we haven't finished scanning it yet. Consider trying again in a few seconds.

Is there a way to make tensorboard read the metric tensors that are already logged and link them with the hparams?

The culprit seems to be

# This doesn't seem to be compatible with the Estimator API
hp.hparams_config(hparams=hparams_list, metrics=metrics_to_monitor)

Simply calling hparams is enough: every metric logged with tf.summary is then picked up. In tensorboard you can filter down to just the metrics you need and compare trials.

import tensorflow as tf
from tensorboard.plugins.hparams import api as hp

with tf.summary.create_file_writer(train_folder).as_default():
    # params is a dict which contains
    # { 'learning_rate': 0.001, 'distance_margin': 0.5,...}
    hp.hparams(hparams=params)
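
One caveat with passing params directly: the hp.hparams documentation says the values should be Python bool, int, float, or string, so a params dict that also carries lists or nested objects may need filtering first. Below is a minimal sketch of that idea. The helper names scalar_hparams and log_hparams are hypothetical (they are not part of TensorFlow or TensorBoard), and the sketch assumes params is a plain dict with string keys.

```python
def scalar_hparams(params):
    """Return only the entries whose values are bool, int, float, or str.

    Hypothetical helper: assumes `params` is a plain dict with string keys.
    """
    allowed = (bool, int, float, str)
    return {
        key: value
        for key, value in params.items()
        if isinstance(value, allowed)
    }


def log_hparams(train_folder, params):
    """Write a single hparams summary for this trial into train_folder.

    Hypothetical wrapper around the snippet above.
    """
    # Imported lazily so scalar_hparams can be used without TensorFlow.
    import tensorflow as tf
    from tensorboard.plugins.hparams import api as hp

    with tf.summary.create_file_writer(train_folder).as_default():
        hp.hparams(scalar_hparams(params))


if __name__ == "__main__":
    # Example values only; "layers" is dropped because it is a list.
    example = {"learning_rate": 0.001, "distance_margin": 0.5, "layers": [64, 32]}
    print(scalar_hparams(example))
```

Calling log_hparams(train_folder, params) once before tf.estimator.train_and_evaluate should then behave like the snippet above, just with the non-scalar entries left out.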