On the use of Batch Normalization
I'm trying to make sure that I'm incorporating batch normalization layers into my model correctly. The snippets below illustrate what I'm doing.
- Is this an appropriate use of batch normalization?
- At inference time, how can I access the moving averages in each batch-norm layer to make sure they are actually being loaded?
import os
import tensorflow.compat.v1 as tf
from model import Model

# Sample batch normalization layer in the Model class
x_preBN = ...
x_postBN = tf.layers.batch_normalization(inputs=x_preBN,
                                         center=True,
                                         scale=True,
                                         momentum=0.9,
                                         training=(self.mode == 'train'))

# During training:
model = Model(mode='train')
saver = tf.train.Saver()
# The ops that update the BN moving statistics live in UPDATE_OPS
# and must be run explicitly alongside the training step.
extra_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.Session() as sess:
    for it in range(max_iterations):
        # Training step + update of BN moving statistics
        sess.run([train_step, extra_update_ops], feed_dict=...)
        # Store checkpoint
        if it % num_checkpoint_steps == 0:
            saver.save(sess,
                       os.path.join(model_dir, 'checkpoint'),
                       global_step=it)

# During inference:
model = Model(mode='eval')
saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, os.path.join(model_dir, 'checkpoint-???'))
    acc = sess.run(model.accuracy, feed_dict=...)
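Regarding the first question: running extra_update_ops together with the train step, as in the training loop above, does keep the BN moving statistics updated. An equivalent and arguably safer TF1 idiom is to attach the update ops as control dependencies of the train op, so they can never be accidentally skipped. A minimal sketch (the optimizer and loss here are illustrative placeholders, not from the original code):

update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    # The BN update ops now run automatically whenever train_step runs.
    train_step = tf.train.AdamOptimizer(1e-3).minimize(loss)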
Once the model has been instantiated, a list of all global variables can be obtained:
model = Model(mode='eval')
saver = tf.train.Saver()
print(tf.global_variables())
The batch-normalization variables of a particular layer look as follows: gamma and beta are trainable, while the moving statistics are not (hence the need to run extra_update_ops explicitly during training).
<tf.Variable 'unit_1_1/residual_only_activation/batch_normalization/gamma:0' shape=(16,) dtype=float32>,
<tf.Variable 'unit_1_1/residual_only_activation/batch_normalization/beta:0' shape=(16,) dtype=float32>,
<tf.Variable 'unit_1_1/residual_only_activation/batch_normalization/moving_mean:0' shape=(16,) dtype=float32>,
<tf.Variable 'unit_1_1/residual_only_activation/batch_normalization/moving_variance:0' shape=(16,) dtype=float32>
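To check this split programmatically, compare tf.global_variables() against tf.trainable_variables(); a small sketch, assuming the layer names shown above:

bn_vars = [v for v in tf.global_variables() if 'batch_normalization' in v.name]
trainable = set(tf.trainable_variables())
for v in bn_vars:
    # gamma/beta should be reported as trainable, the moving stats as not
    print(v.name, '->', 'trainable' if v in trainable else 'moving statistic')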
They can be accessed as usual; for example, the moving mean listed above can be looked up by name (note that saver.restore expects a checkpoint path, not a directory, so tf.train.latest_checkpoint is used here):

ma = [v for v in tf.global_variables()
      if v.name == 'unit_1_1/residual_only_activation/batch_normalization/moving_mean:0'][0]
with tf.Session() as sess:
    saver.restore(sess, tf.train.latest_checkpoint(model_dir))
    print(sess.run(ma))
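Regarding the second question: instead of inspecting a single variable, you can loop over every moving mean and variance after restoring a checkpoint and confirm the values have moved away from their initializers (zeros for moving_mean, ones for moving_variance). A sketch, assuming model_dir holds the checkpoints saved above:

moving_stats = [v for v in tf.global_variables()
                if 'moving_mean' in v.name or 'moving_variance' in v.name]
with tf.Session() as sess:
    saver.restore(sess, tf.train.latest_checkpoint(model_dir))
    for v in moving_stats:
        # Spot-check the first few entries of each moving statistic
        print(v.name, sess.run(v)[:3])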