How can I get the global_step in a MonitoredTrainingSession?
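I am running an MNIST model distributed with distributed TensorFlow. I want to monitor the evolution of global_step "manually" for debugging purposes. What is the best and cleanest way to get the global step in a distributed TensorFlow setting?
Here is my code: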
    ...
    with tf.device(device):
        images = tf.placeholder(tf.float32, [None, 784], name='image_input')
        labels = tf.placeholder(tf.float32, [None], name='label_input')
        data = read_data_sets(FLAGS.data_dir,
                              one_hot=False,
                              fake_data=False)
        logits = mnist.inference(images, FLAGS.hidden1, FLAGS.hidden2)
        loss = mnist.loss(logits, labels)
        loss = tf.Print(loss, [loss], message="Loss = ")
        train_op = mnist.training(loss, FLAGS.learning_rate)

    hooks = [tf.train.StopAtStepHook(last_step=FLAGS.nb_steps)]

    with tf.train.MonitoredTrainingSession(
            master=target,
            is_chief=(FLAGS.task_index == 0),
            checkpoint_dir=FLAGS.log_dir,
            hooks=hooks) as sess:
        while not sess.should_stop():
            xs, ys = data.train.next_batch(FLAGS.batch_size, fake_data=False)
            sess.run([train_op], feed_dict={images: xs, labels: ys})
            global_step_value = # ... what is the clean way to get this variable
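It is generally good practice to create the global step variable while defining the graph, e.g. global_step = tf.Variable(0, trainable=False, name='global_step'). You can then retrieve the global step tensor with graph.get_tensor_by_name("global_step:0") and fetch its value through sess.run.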
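To make this concrete, here is a minimal single-process sketch of fetching the global step in the same sess.run call as the training op. It is not the asker's actual setup: the tiny linear model, the random input data, and the helper name train_and_track_steps are hypothetical stand-ins for the mnist module and the data reader. It also assumes the TF 1.x API is reached through tensorflow.compat.v1, and it uses tf.train.get_or_create_global_step() rather than a hand-made tf.Variable. The fetch is placed in the same run call because a MonitoredSession refuses further run calls once a hook has requested a stop.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TF 1.x graph API via compat.v1

tf.disable_v2_behavior()


def train_and_track_steps(nb_steps=5, batch_size=8):
    """Train a toy model; return the global_step value fetched at each step."""
    graph = tf.Graph()
    seen = []
    with graph.as_default():
        images = tf.placeholder(tf.float32, [None, 784], name='image_input')
        labels = tf.placeholder(tf.int64, [None], name='label_input')

        # Hypothetical stand-in for mnist.inference / mnist.loss.
        weights = tf.Variable(tf.zeros([784, 10]), name='weights')
        biases = tf.Variable(tf.zeros([10]), name='biases')
        logits = tf.matmul(images, weights) + biases
        loss = tf.reduce_mean(
            tf.nn.sparse_softmax_cross_entropy_with_logits(
                labels=labels, logits=logits))

        # Create (or reuse) the global step; the optimizer increments it.
        global_step = tf.train.get_or_create_global_step()
        train_op = tf.train.GradientDescentOptimizer(0.01).minimize(
            loss, global_step=global_step)

        hooks = [tf.train.StopAtStepHook(last_step=nb_steps)]
        with tf.train.MonitoredTrainingSession(hooks=hooks) as sess:
            while not sess.should_stop():
                xs = np.random.rand(batch_size, 784).astype(np.float32)
                ys = np.random.randint(0, 10, size=batch_size)
                # Fetch the step together with train_op. The read is not
                # ordered relative to the increment, so the value may be
                # the step count before or after this update.
                _, global_step_value = sess.run(
                    [train_op, global_step],
                    feed_dict={images: xs, labels: ys})
                seen.append(int(global_step_value))
    return seen


if __name__ == "__main__":
    print(train_and_track_steps())
```

In the original code the same idea would amount to adding the global step tensor to the fetch list of the existing sess.run call. With several workers the value can advance by more than one between two of your own steps, since every worker increments the shared variable.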