If the batch_size equals 1 in tf.layers.batch_normalization(), will it work correctly?

Hi everyone. I'm using TensorFlow 1.4 to train a U-Net-like model. Because of hardware limitations, I can only set batch_size to 1 during training; otherwise I get an OOM error.

Here is my question: in this case, with batch_size equal to 1, does tf.layers.batch_normalization() work correctly (i.e., the moving mean, moving variance, gamma, and beta)? Does such a small batch_size make it unstable?
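From reading the docs, my understanding is that the moving statistics used at test time are an exponential moving average of the per-step batch statistics. Here is a small NumPy sketch of that update rule (momentum=0.99 is the tf.layers.batch_normalization() default; the synthetic data is made up just for illustration):

import numpy as np

momentum = 0.99  # default momentum in tf.layers.batch_normalization()
moving_mean, moving_var = 0.0, 1.0  # how the moving statistics are initialized

for step in range(1000):
    # With batch_size = 1, the per-step "batch" statistics are just the
    # statistics of a single instance, so they fluctuate between steps.
    instance = np.random.normal(loc=2.0, scale=1.5, size=(16, 16))
    batch_mean, batch_var = instance.mean(), instance.var()
    # These updates correspond to the ops collected in tf.GraphKeys.UPDATE_OPS:
    moving_mean = momentum * moving_mean + (1.0 - momentum) * batch_mean
    moving_var = momentum * moving_var + (1.0 - momentum) * batch_var

print(moving_mean, moving_var)  # approaches ~2.0 and ~2.25: the EMA smooths the noise

Because momentum is high, this average runs over many steps, which should smooth out per-instance noise; my worry is whether the per-step normalization itself becomes unstable.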

In my code, I set training=True at training time and training=False at test time. At training time, I use:

logits = mymodel.inference()
loss = tf.losses.mean_squared_error(labels, logits)
# Batch norm's moving-average updates are collected in UPDATE_OPS and must
# run together with the train op, hence the control dependency.
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = optimizer.minimize(loss)
...

saver = tf.train.Saver(tf.global_variables())
with tf.Session() as sess:
    sess.run(tf.group(tf.global_variables_initializer(),
                      tf.local_variables_initializer()))
    sess.run(train_op)
    ...
    saver.save(sess, save_path, global_step)

At test time, I use:

logits = mymodel.inference()
saver = tf.train.Saver()

with tf.Session() as sess:
    saver.restore(sess, checkpoint)
    sess.run(tf.local_variables_initializer())
    results = sess.run(logits)

Could anyone tell me whether I'm using it incorrectly? And how much does setting batch_size to 1 affect tf.layers.batch_normalization()?

Any help would be appreciated! Thanks in advance.

Yes, tf.layers.batch_normalization() works with batches of a single element. Performing batch normalization on such a batch is actually called instance normalization (i.e., normalization of a single instance).
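To see this concretely, here is a minimal sketch (TF 1.x; the input shape and random data are made up for illustration) showing that in training mode, batch norm with batch_size = 1 computes its statistics from the single instance alone, which is exactly instance normalization:

import numpy as np
import tensorflow as tf

# One NHWC image: batch_size = 1, 4x4 spatial, 3 channels.
x = tf.placeholder(tf.float32, shape=[1, 4, 4, 3])

# Training mode: statistics come from the current batch, i.e. from this
# single instance (mean/variance over the N, H, W axes, per channel).
bn = tf.layers.batch_normalization(x, training=True)

# The same normalization by hand: per-channel statistics of the one
# instance, which is the definition of instance normalization. With the
# default gamma=1 and beta=0, the two outputs should match.
mean, var = tf.nn.moments(x, axes=[0, 1, 2], keep_dims=True)
inst_norm = (x - mean) / tf.sqrt(var + 1e-3)  # 1e-3 is the BN epsilon default

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    img = np.random.rand(1, 4, 4, 3).astype(np.float32)
    a, b = sess.run([bn, inst_norm], feed_dict={x: img})
    print(np.allclose(a, b, atol=1e-4))  # True

Note that at test time (training=False) the layer switches to the moving statistics accumulated during training, so each instance is no longer normalized by its own statistics.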

@Maxim made a great post about instance normalization if you want to know more. You can also find more theory on the web and in the literature, e.g. "Instance Normalization: The Missing Ingredient for Fast Stylization".