Standardization of input images for quantized neural networks

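I have been working with quantized neural networks (which need input images with pixel values in [0, 255]) for a while. For the ssd_mobilenet_v1.tflite model, however, https://tfhub.dev/tensorflow/lite-model/ssd_mobilenet_v1/1/metadata/2 gives the following standardization parameters: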

 mean: 127.5
 std : 127.5

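With these parameters, the usual formula normalized_input = (input - mean) / std makes no sense to me. When the pixel value is below 128, the bracket becomes 0 and the normalized input is 0 as well. So every value under 128 would result in a black pixel. That can't be right, or am I wrong?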

Thanks for your help. I would love to have a discussion about this here.

Kind regards, Chris

I would say it is completely normal that every value in a tensor is normalized according to the mean and std, resulting in black pixels:

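import tensorflow as tf

mean = 127.5
std = 127.5

# Toy 2x2 "image": two random channels in [0, 1) plus one channel fixed at 128.
image = tf.concat(
    [tf.random.uniform((1, 2, 2, 2)), tf.fill((1, 2, 2, 1), 128.0)],
    axis=-1,
)
# Floating-point arithmetic: values below the mean come out negative, not 0.
normalized_image = (image - mean) / std
print(image)
print(normalized_image)

Output: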
tf.Tensor(
[[[[  0.50647175   0.20693159 128.        ]
   [  0.18777049   0.9095379  128.        ]]

  [[  0.42894745   0.76806736 128.        ]
   [  0.58564055   0.31613588 128.        ]]]], shape=(1, 2, 2, 3), dtype=float32)
tf.Tensor(
[[[[-0.9960277  -0.998377    0.00392157]
   [-0.9985273  -0.99286634  0.00392157]]

  [[-0.99663574 -0.99397594  0.00392157]
   [-0.99540675 -0.9975205   0.00392157]]]], shape=(1, 2, 2, 3), dtype=float32)
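
To make the arithmetic behind that output easier to follow, here is a minimal plain-Python sketch of the same formula applied to single pixel values. The helper names `normalize` and `denormalize` are made up for illustration and are not part of TensorFlow; the only inputs taken from the thread are mean = 127.5 and std = 127.5.

```python
MEAN = 127.5  # from the model metadata quoted in the question
STD = 127.5


def normalize(pixel, mean=MEAN, std=STD):
    """Map a raw pixel value to the normalized range using float arithmetic."""
    return (float(pixel) - mean) / std


def denormalize(value, mean=MEAN, std=STD):
    """Invert normalize(): recover the raw pixel value."""
    return value * std + mean


for pixel in (0, 64, 128, 255):
    # Pixels below the mean map to negative numbers, they do not collapse to 0.
    print(pixel, "->", round(normalize(pixel), 8))
```

Under these assumptions, 0 maps to -1.0 and 255 maps to 1.0, so the whole [0, 255] range lands in [-1, 1], and 128 lands just above zero, which matches the 0.00392157 entries in the output above.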

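I have often come across projects that compute the mean and standard deviation over the entire image dataset and standardize the images with those statistics: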

import tensorflow as tf
import matplotlib.pyplot as plt

image = tf.concat(
    [tf.random.uniform((1, 2, 2, 2)), tf.fill((1, 2, 2, 1), 128.0)],
    axis=-1,
)
# Standardize with statistics computed from the data itself (here: one tensor).
normalized_image = (image - tf.reduce_mean(image, keepdims=True)) / tf.math.reduce_std(image, keepdims=True)

print(image)
print(normalized_image)

# Draw both images side by side so the second plot does not overwrite the first.
fig, (ax_raw, ax_norm) = plt.subplots(1, 2)
ax_raw.imshow(tf.squeeze(image, axis=0))
ax_norm.imshow(tf.squeeze(normalized_image, axis=0))
plt.show()

Output:
tf.Tensor(
[[[[7.1283507e-01 6.4363706e-01 1.2800000e+02]
   [1.5691042e-02 2.3734951e-01 1.2800000e+02]]

  [[6.6603470e-01 1.3576746e-01 1.2800000e+02]
   [3.1267488e-01 9.6504271e-01 1.2800000e+02]]]], shape=(1, 2, 2, 3), dtype=float32)
tf.Tensor(
[[[[-0.70291406 -0.7040649   1.414201  ]
   [-0.71450937 -0.7108226   1.414201  ]]

  [[-0.70369244 -0.71251214  1.414201  ]
   [-0.7095697  -0.69871914  1.414201  ]]]], shape=(1, 2, 2, 3), dtype=float32)
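
The pattern in that output (about -0.707 for the small values, about 1.414 for the 128 channel) can be reproduced by hand. The following is a rough plain-Python sketch, not the answer's code: `standardize` is a hypothetical helper, and the three-value list is a simplified stand-in for the tensor above (two values near zero for every 128).

```python
import statistics


def standardize(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std, like tf.math.reduce_std
    if std == 0:
        raise ValueError("constant input: standard deviation is zero")
    return [(v - mean) / std for v in values]


pixels = [0.0, 0.0, 128.0]  # simplified stand-in for the toy tensor
standardized = standardize(pixels)
print(standardized)
```

After standardization the values have mean 0 and standard deviation 1, which is why the constant 128 channel ends up near the square root of 2 instead of at a fixed, dataset-independent value.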

In many other projects you will also simply see uint8 images scaled to the range [0, 1], which in practice means that every image is divided by 255.
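
As a small illustration of how the divide-by-255 convention relates to the mean = 127.5, std = 127.5 convention from the question, here is a plain-Python sketch. Both function names are invented for this example; the point is only that the [-1, 1] variant is the [0, 1] variant stretched and shifted.

```python
def scale_to_unit(pixel):
    """Scale a uint8-range pixel value to [0, 1]."""
    return pixel / 255.0


def scale_to_signed_unit(pixel):
    """Scale a uint8-range pixel value to [-1, 1] with mean = std = 127.5."""
    return (pixel - 127.5) / 127.5


for pixel in (0, 51, 255):
    unit = scale_to_unit(pixel)
    signed = scale_to_signed_unit(pixel)
    # 2 * (x / 255) - 1 equals (x - 127.5) / 127.5 up to rounding.
    print(pixel, unit, signed, abs((2.0 * unit - 1.0) - signed) < 1e-12)
```

Which of the two ranges is right depends on how the model was trained, so the safest choice is whatever the model's own metadata or documentation states.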

Sorry to jump in! I think TensorFlow's normalization function is a fraction function that takes the beta, gamma, and sigma values into account.

import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.InputLayer(input_shape=(1, 32, 32, 3)),
    # Each Normalization layer applies (x - mean) / sqrt(variance) with fixed statistics.
    tf.keras.layers.Normalization(mean=3., variance=2., name='Layer_1'),
    tf.keras.layers.Normalization(mean=4., variance=6., name='Layer_2'),
    tf.keras.layers.Dense(256, activation='relu', name='Layer_3'),
])

# Flatten the feature map before the final 6-way softmax classifier.
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(6, activation=tf.nn.softmax, name='Layer_4'))
model.summary()
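
For what the two Normalization layers in the model above do to a single value, here is a hedged plain-Python sketch. It assumes the documented behaviour of `tf.keras.layers.Normalization` with fixed statistics, (x - mean) / sqrt(variance), and ignores the small epsilon guard the layer applies to the denominator. The function names are illustrative only.

```python
import math


def normalization_layer(x, mean, variance):
    """Scalar version of a Normalization layer with fixed mean and variance."""
    return (x - mean) / math.sqrt(variance)


def layer_1_then_layer_2(x):
    """Chain the two layers with the statistics used in the model above."""
    hidden = normalization_layer(x, mean=3.0, variance=2.0)
    return normalization_layer(hidden, mean=4.0, variance=6.0)


for x in (0.0, 3.0, 255.0):
    print(x, "->", layer_1_then_layer_2(x))
```

Note that the layer takes a variance, not a standard deviation, so reproducing the question's std = 127.5 this way would mean passing variance = 127.5 ** 2.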

with tf.compat.v1.variable_scope('Layer_1', reuse=tf.compat.v1.AUTO_REUSE):
    v2 = tf.compat.v1.get_variable('v2', shape=[256])       # <tf.Variable 'Layer_1/v2:0' shape=(256,) dtype=float32, numpy=array([-0.06715409,  0.10130859,  0.05591007, -0.05931217,  0.10036706, ...
    x1 = tf.compat.v1.get_variable('x', shape=[256])        # <tf.Variable 'Layer_1/x:0' shape=(256,) dtype=float32, numpy=array([-6.63143843e-02,  3.17198113e-02,  1.04614533e-01, -2.30028257e-02, ...
    y1 = tf.compat.v1.get_variable('y', shape=[256])        # <tf.Variable 'Layer_1/y:0' shape=(256,) dtype=float32, numpy=array([-0.10782533,  0.01488321, -0.04950972, -0.09561327,  0.10698273, ...
    y2 = tf.compat.v1.get_variable('y_', shape=[256])       # <tf.Variable 'Layer_1/y_:0' shape=(256,) dtype=float32, numpy=array([-0.04931336, -0.10670284, -0.10054329, -0.09619174,  0.08752564, ...
    mu = tf.compat.v1.get_variable('mu', shape=[256])       # <tf.Variable 'Layer_1/mu:0' shape=(256,) dtype=float32, numpy=array([-0.06098992,  0.02202646, -0.05624849,  0.0602672 , -0.02878931, ...
    sigma = tf.compat.v1.get_variable('sigma', shape=[256]) # <tf.Variable 'Layer_1/sigma:0' shape=(256,) dtype=float32, numpy=array([ 2.84786597e-02,  1.00004725e-01, -8.51654559e-02, -5.34656569e-02, ...
    gamma = tf.compat.v1.get_variable('gamma', shape=[256]) # <tf.Variable 'Layer_1/gamma:0' shape=(256,) dtype=float32, numpy=array([ 0.10177503,  0.04634983, -0.02325767,  0.04158259,  0.10051229, ...
    beta = tf.compat.v1.get_variable('beta', shape=[256])   # <tf.Variable 'Layer_1/beta:0' shape=(256,) dtype=float32, numpy=array([-7.85651207e-02, -4.94908020e-02,  8.88925046e-03,  9.37148184e-03, ...
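
The answer does not spell out how beta, gamma, mu, and sigma are combined. One common reading is the batch-normalization style formula, gamma * (x - mu) / sqrt(sigma^2 + eps) + beta, and the sketch below assumes exactly that. The function name, the default values, and the `eps` term are assumptions made for illustration; they are not taken from the thread or from a specific TensorFlow API.

```python
import math


def scaled_normalize(x, mu, sigma, gamma=1.0, beta=0.0, eps=1e-3):
    """Assumed batch-norm style normalization: scale by gamma, shift by beta."""
    return gamma * (x - mu) / math.sqrt(sigma * sigma + eps) + beta


# With gamma = 1, beta = 0 and eps = 0 this reduces to the question's formula.
print(scaled_normalize(128.0, mu=127.5, sigma=127.5, eps=0.0))

# gamma and beta then rescale and shift the normalized value.
print(scaled_normalize(255.0, mu=127.5, sigma=127.5, gamma=2.0, beta=1.0, eps=0.0))
```

Under this reading, the fixed mean/std normalization from the question is the special case where the learned scale and shift are left at 1 and 0.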