The batch normalization layer does not work (tensorflow)

Question

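I implemented a network in TensorFlow, but the loss does not converge. I then inspected some values inside the network and found that the BN layer is not working. See the figure below: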

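We can see that s2 is the result of applying batch normalization to s1, yet the values in s2 are still very large. I don't know what went wrong. Why are the values in s2 so large?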

I have uploaded my code to GitHub. Anyone interested can test it.

Answer 1

According to the official TensorFlow documentation:

when training, the moving_mean and moving_variance need to be updated. By default the update ops are placed in tf.GraphKeys.UPDATE_OPS, so they need to be executed alongside the train_op. Also, be sure to add any batch_normalization ops before getting the update_ops collection. Otherwise, update_ops will be empty, and training/inference will not work properly.

For example:

training = tf.placeholder(tf.bool, name="is_training")
# ...
x_norm = tf.layers.batch_normalization(x, training=training)
# ...
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
train_op = optimizer.minimize(loss)
train_op = tf.group([train_op, update_ops])

# or, you can also do something like this:
# with tf.control_dependencies(update_ops):
#    train_op = optimizer.minimize(loss)

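It is therefore important to fetch the update ops as the TensorFlow documentation describes, because the layer's moving mean and moving variance must be updated during training. If you skip this, batch normalization will not work and the network will not train as expected. It is also useful to declare a placeholder that tells the network whether it is in training or inference mode, because at test (inference) time the mean and variance are fixed: they are estimated from the per-batch means and variances computed earlier during training.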
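To make the symptom in the question concrete, here is a minimal framework-free sketch of what happens when the moving statistics are never updated. It is not TensorFlow code: the class `BatchNorm1D` and the flag `run_update_ops` are hypothetical names made up for this illustration, gamma and beta are fixed at 1 and 0, and the input batch is invented. The momentum, epsilon, and initial moving statistics are assumed to match the defaults of tf.layers.batch_normalization (0.99, 0.001, zeros, and ones).

```python
import math

MOMENTUM = 0.99  # assumed default momentum of tf.layers.batch_normalization
EPSILON = 1e-3   # assumed default epsilon of tf.layers.batch_normalization


class BatchNorm1D:
    """Toy batch normalization over one feature (gamma=1, beta=0). Hypothetical helper."""

    def __init__(self):
        # Moving statistics start at mean 0 and variance 1.
        self.moving_mean = 0.0
        self.moving_variance = 1.0

    def __call__(self, batch, training, run_update_ops=True):
        if training:
            # Training mode normalizes with the statistics of the current batch.
            mean = sum(batch) / len(batch)
            variance = sum((v - mean) ** 2 for v in batch) / len(batch)
            if run_update_ops:
                # This stands in for the ops collected in tf.GraphKeys.UPDATE_OPS.
                self.moving_mean = MOMENTUM * self.moving_mean + (1 - MOMENTUM) * mean
                self.moving_variance = (MOMENTUM * self.moving_variance
                                        + (1 - MOMENTUM) * variance)
        else:
            # Inference mode normalizes with the stored moving statistics.
            mean, variance = self.moving_mean, self.moving_variance
        scale = 1.0 / math.sqrt(variance + EPSILON)
        return [(v - mean) * scale for v in batch]


batch = [1000.0, 1200.0, 800.0, 1100.0, 900.0]  # made-up activations with a large scale

with_updates = BatchNorm1D()
without_updates = BatchNorm1D()
for _ in range(2000):  # simulate 2000 training steps on the same batch
    with_updates(batch, training=True, run_update_ops=True)
    without_updates(batch, training=True, run_update_ops=False)

# Switch both layers to inference mode.
good = with_updates(batch, training=False)
bad = without_updates(batch, training=False)
print("update ops run:    ", [round(v, 2) for v in good])
print("update ops skipped:", [round(v, 2) for v in bad])
```

When the updates are skipped, the inference-mode layer still normalizes with mean 0 and variance 1, so its output is almost identical to its input. That is one plausible way to end up with "normalized" values as large as those described in the question, although whether it is the cause in the asker's code cannot be confirmed from the post alone.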
