Display loss in a Tensorflow DQN without leaving tf.Session()
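I have a DQN fully set up and working, but I can't figure out how to display the loss without leaving the Tensorflow session.
I first thought it would involve creating a new function or class, but I'm not sure where to put it in the code, or what exactly should go into that function or class.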
observations = tf.placeholder(tf.float32, shape=[None, num_stops], name='observations')
actions = tf.placeholder(tf.int32,shape=[None], name='actions')
rewards = tf.placeholder(tf.float32,shape=[None], name='rewards')
# Model
Y = tf.layers.dense(observations, 200, activation=tf.nn.relu)
Ylogits = tf.layers.dense(Y, num_stops)
# sample an action from predicted probabilities
sample_op = tf.random.categorical(logits=Ylogits, num_samples=1)
# loss
cross_entropies = tf.losses.softmax_cross_entropy(onehot_labels=tf.one_hot(actions,num_stops), logits=Ylogits)
loss = tf.reduce_sum(rewards * cross_entropies)
# training operation
optimizer = tf.train.RMSPropOptimizer(learning_rate=0.001, decay=.99)
train_op = optimizer.minimize(loss)
I then run the network, which works correctly.
with tf.Session() as sess:
    '''etc. The network is run'''
    sess.run(train_op, feed_dict={observations: observations_list,
                                  actions: actions_list,
                                  rewards: rewards_list})
I would like to display the loss that train_op minimizes to the user.
Try this: fetch loss together with train_op in a single sess.run call.
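# Fetch the loss tensor together with train_op in the same sess.run call.
# Store the result under a new name so the `loss` tensor is not overwritten.
loss_value, _ = sess.run([loss, train_op], feed_dict={observations: observations_list,
                                                      actions: actions_list,
                                                      rewards: rewards_list})
print('loss:', loss_value)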
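For anyone who wants to see this in a loop, here is a minimal sketch of fetching the loss alongside the training op at every step. It rests on several assumptions that are not in the question: it imports the 1.x API through `tensorflow.compat.v1`, replaces the dense layers with a single hand-written linear layer (`W`, `b`), uses `tf.nn.sparse_softmax_cross_entropy_with_logits` to get one cross-entropy per sample, and feeds random placeholder data. The value of `num_stops`, the batch size, and the names `feed`, `loss_value` and `loss_history` are all hypothetical.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TF 2.x installed, used in 1.x graph mode

tf.disable_v2_behavior()
tf.reset_default_graph()

num_stops = 4  # hypothetical size, the question does not state it

observations = tf.placeholder(tf.float32, shape=[None, num_stops], name='observations')
actions = tf.placeholder(tf.int32, shape=[None], name='actions')
rewards = tf.placeholder(tf.float32, shape=[None], name='rewards')

# Stand-in model: one linear layer instead of the question's two dense layers.
W = tf.Variable(tf.zeros([num_stops, num_stops]), name='W')
b = tf.Variable(tf.zeros([num_stops]), name='b')
logits = tf.matmul(observations, W) + b

# One cross-entropy value per sample, weighted by the reward and summed.
cross_entropies = tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=actions, logits=logits)
loss = tf.reduce_sum(rewards * cross_entropies)
train_op = tf.train.RMSPropOptimizer(learning_rate=0.001, decay=0.99).minimize(loss)

# Random placeholder batch, only here to make the sketch runnable.
rng = np.random.RandomState(0)
feed = {
    observations: rng.rand(8, num_stops).astype(np.float32),
    actions: rng.randint(0, num_stops, size=8).astype(np.int32),
    rewards: np.ones(8, dtype=np.float32),
}

loss_history = []
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(5):
        # Both fetches are evaluated in one run call; the loss comes back
        # as a NumPy scalar while the session stays open.
        loss_value, _ = sess.run([loss, train_op], feed_dict=feed)
        loss_history.append(float(loss_value))
        print('step %d: loss = %.4f' % (step, loss_value))
```

Keeping the fetched value in `loss_value` rather than `loss` matters inside a loop: reusing the name `loss` would replace the tensor with a number, and the next `sess.run` call would fail.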