Hparams plugin with tf.keras (tensorflow 2.0)

Question

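I am trying to follow the example in the TensorFlow docs and set up hyperparameter logging. The docs also mention that, if you use tf.keras, you can just use the callback hp.KerasCallback(logdir, hparams). However, when I use the callback, I don't get my metrics (only the outcome).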

Answer 1

I got it working, but I'm not entirely sure what the magic word was. Here is my flow in case it helps.

from tensorboard.plugins.hparams import api as hp

HP_NUM_LATENT = hp.HParam('num_latent_dim', hp.Discrete([2, 5, 100]))
hparams = {
    HP_NUM_LATENT: num_latent,  # num_latent: the value chosen for this run
}

# hparams must be defined before the callback that logs it is created
callbacks.append(hp.KerasCallback(log_dir, hparams))

model = create_simple_model(latent_dim=hparams[HP_NUM_LATENT])  # returns compiled model
model.fit(x, y, validation_data=validation_data,
          epochs=4,
          verbose=2,
          callbacks=callbacks)

Answer 2

The trick is to define the Hparams config with the path in which TensorBoard saves its validation logs.

So, if your TensorBoard callback is set up as:

log_dir = 'path/to/training-logs'
tensorboard_cb = TensorBoard(log_dir=log_dir)

then you should set up Hparams like this:

hparams_dir = os.path.join(log_dir, 'validation')

with tf.summary.create_file_writer(hparams_dir).as_default():
    hp.hparams_config(
        hparams=HPARAMS,
        metrics=[hp.Metric('epoch_accuracy')]  # metric saved by tensorboard_cb
    )

hparams_cb = hp.KerasCallback(
    writer=hparams_dir,
    hparams=HPARAMS
)
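
To make the directory relationship explicit, here is a minimal sketch that builds both callbacks from one log_dir. The helper names validation_log_dir and make_callbacks are hypothetical, and the sketch assumes hparams is a dict mapping hp.HParam objects to their chosen values and that the metric of interest is logged under the tag epoch_accuracy.

```python
import os


def validation_log_dir(log_dir):
    # The Keras TensorBoard callback writes validation summaries to
    # <log_dir>/validation, so the hparams summaries are written there too.
    return os.path.join(log_dir, "validation")


def make_callbacks(log_dir, hparams, metric_tag="epoch_accuracy"):
    """Return [TensorBoard callback, hparams callback] for one training run.

    Assumption: `hparams` is a dict mapping hp.HParam objects to values.
    """
    # Imported here so the path helper above can be used without TensorFlow.
    import tensorflow as tf
    from tensorboard.plugins.hparams import api as hp

    hparams_dir = validation_log_dir(log_dir)
    with tf.summary.create_file_writer(hparams_dir).as_default():
        hp.hparams_config(
            hparams=list(hparams.keys()),
            metrics=[hp.Metric(metric_tag)],  # tag written by the TensorBoard callback
        )
    return [
        tf.keras.callbacks.TensorBoard(log_dir=log_dir),
        hp.KerasCallback(hparams_dir, hparams),
    ]
```

The returned list can then be passed as model.fit(..., callbacks=make_callbacks(log_dir, hparams)).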

Answer 3

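I just want to add to the previous answers. If you are using TensorBoard in a notebook on Colab, the problem may not be caused by your code but by the way TensorBoard runs on Colab. The solution is to kill the existing TensorBoard process and start it again.

Please correct me if I'm wrong.

Sample code: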

import os
import random

import tensorflow as tf
from tensorboard.plugins.hparams import api as hp

HP_LR = hp.HParam('learning_rate', hp.Discrete([1e-4, 5e-4, 1e-3]))
HPARAMS = [HP_LR]
# METRICS does not seem to have any effect in my example, as
# hp uses epoch_accuracy and epoch_loss for both training and validation anyway.
METRICS = [hp.Metric('epoch_accuracy', group="validation", display_name='val_accuracy')]
# save the configuration
log_dir = '/content/logs/hparam_tuning'
with tf.summary.create_file_writer(log_dir).as_default():
  hp.hparams_config(hparams=HPARAMS, metrics=METRICS)


def fitness_func(hparams, seed):
  rng = random.Random(seed)

  # here we build the model
  model = tf.keras.Sequential(...)
  model.compile(..., metrics=['accuracy'])  # need to pass the metric of interest

  # set up callbacks
  _log_dir = os.path.join(log_dir, seed)
  tb_callbacks = tf.keras.callbacks.TensorBoard(_log_dir)  # log metrics
  hp_callbacks = hp.KerasCallback(_log_dir, hparams)  # log hparams

  # fit the model
  history = model.fit(
    ..., validation_data=(x_te, y_te), callbacks=[tb_callbacks, hp_callbacks])


rng = random.Random(0)
session_index = 0
# random search
num_session_groups = 4
sessions_per_group = 2
for group_index in range(num_session_groups):
  hparams = {h: h.domain.sample_uniform(rng) for h in HPARAMS}
  hparams_string = str(hparams)
  for repeat_index in range(sessions_per_group):
    session_id = str(session_index)
    session_index += 1
    fitness_func(hparams, session_id)
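
The loop above gives every session its own subdirectory under log_dir, while sessions in the same group share one sampled set of hyperparameters. The following sketch reproduces only that bookkeeping in plain Python, without TensorFlow. The function plan_sessions is hypothetical, and rng.choice is used as a stand-in for h.domain.sample_uniform(rng) on a discrete domain.

```python
import os
import random

LEARNING_RATES = [1e-4, 5e-4, 1e-3]  # same values as the HP_LR domain above


def plan_sessions(log_dir, num_session_groups, sessions_per_group, seed=0):
    """Return a list of (run_dir, hparam_values) pairs, one per session."""
    rng = random.Random(seed)
    sessions = []
    for _ in range(num_session_groups):
        # One sample per group; stand-in for h.domain.sample_uniform(rng).
        lr = rng.choice(LEARNING_RATES)
        for _ in range(sessions_per_group):
            session_id = str(len(sessions))  # running index across all groups
            run_dir = os.path.join(log_dir, session_id)
            sessions.append((run_dir, {"learning_rate": lr}))
    return sessions


for run_dir, values in plan_sessions("/content/logs/hparam_tuning", 4, 2):
    print(run_dir, values)
```

Because each run directory is distinct, each session should show up as a separate row in the HParams dashboard.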

To check whether there are any existing TensorBoard processes, run the following in Colab:

!ps ax | grep tensorboard

Suppose the PID of the TensorBoard process is 5315. Then,

!kill 5315

and run

# of course, replace the dir below with your log_dir
%tensorboard --logdir='/content/logs/hparam_tuning'

In my case, after I reset TensorBoard as described above, it correctly recorded the metric specified in model.compile, i.e. accuracy.
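
The PID lookup can also be scripted. Below is a minimal sketch that extracts TensorBoard PIDs from `ps ax` output and skips the grep process itself, which also matches the search term. The function name tensorboard_pids is hypothetical, and the sample text is made up for illustration (only the PID 5315 comes from the steps above).

```python
def tensorboard_pids(ps_output):
    """Return PIDs of TensorBoard processes found in `ps ax` output."""
    pids = []
    for line in ps_output.splitlines():
        # `ps ax` columns: PID TTY STAT TIME COMMAND
        fields = line.split(None, 4)
        if len(fields) < 5 or not fields[0].isdigit():
            continue  # header line or malformed row
        command = fields[4]
        if "tensorboard" in command and "grep" not in command:
            pids.append(int(fields[0]))
    return pids


# Illustrative input only; real output will differ.
sample = """\
  PID TTY      STAT   TIME COMMAND
 5315 ?        Sl     0:07 /usr/bin/python3 /usr/local/bin/tensorboard --logdir /content/logs/hparam_tuning
 5402 ?        S      0:00 grep tensorboard
"""
print(tensorboard_pids(sample))  # [5315]
```

Real input could be obtained with subprocess.run(["ps", "ax"], capture_output=True, text=True).stdout, and each returned PID passed to os.kill.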

Answer 4

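I wasted a few hours because of this. I'd like to add to Julian's nice answer about defining the hparams config: the tag of the metric you want to log with hparams, and possibly its group, as in hp.Metric(tag='epoch_accuracy', group='validation'), should match one of the metrics you captured with Keras via model.compile(..., metrics=). See hparams_demo for a good example.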
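
As a rough illustration of the matching rule, the sketch below maps a Keras log key to the group and tag that hp.Metric would need. It assumes the TF 2.x TensorBoard callback convention of prefixing epoch-level tags with epoch_ and writing val_ metrics to the validation subdirectory. The function tensorboard_location is hypothetical and not part of any library.

```python
def tensorboard_location(keras_log_key):
    """Map a Keras log key to the (group, tag) pair expected by hp.Metric.

    Assumed convention: 'val_<name>' is written to the 'validation' run as
    'epoch_<name>'; any other key goes to the 'train' run as 'epoch_<key>'.
    """
    prefix = "val_"
    if keras_log_key.startswith(prefix):
        return "validation", "epoch_" + keras_log_key[len(prefix):]
    return "train", "epoch_" + keras_log_key


for key in ["accuracy", "val_accuracy", "val_loss"]:
    group, tag = tensorboard_location(key)
    print(f"{key}: hp.Metric(tag={tag!r}, group={group!r})")
```

Under that assumption, compiling with metrics=['accuracy'] and passing validation_data corresponds to hp.Metric(tag='epoch_accuracy', group='validation').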

tensorflow

tensorflow2.0