Quantized Convolution Layer Operation in TensorFlow Lite

I would like to understand the basic operations performed in a convolution layer of a quantized model in TensorFlow Lite.

As a baseline, I chose a pretrained TensorFlow model, EfficientNet-lite0-int8, and used a sample image as input for the model's inference. I then extracted the output tensor of the first convolution layer with fused ReLU6 and compared it with the output of my custom Python implementation of that layer.

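The deviation between the two tensors was large, and what I cannot explain is that TensorFlow's output tensor is not within the expected [0, 6] range (which I expected because of the ReLU6 activation fused into the Conv layer).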

Could you please provide a more detailed description of how a quantized Conv2D layer with fused ReLU6 operates in TensorFlow Lite?

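Later, after studying TensorFlow's GitHub repository carefully, I found the kernel_util.cc file and the CalculateActivationRangeUint8 function. With this function I managed to understand why the output tensor of the quantized Conv2D layer with fused ReLU6 is not clipped to [0, 6] but to [-128, 127]. For the record, I managed to implement the Conv2D layer's operation in Python in a few simple steps.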
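To illustrate why the clipping bounds end up at [-128, 127], here is a minimal sketch of how the real-valued ReLU6 bounds 0 and 6 can be mapped into the quantized domain through the output scale and zero point. It mirrors the idea of the activation-range helpers in kernel_util.cc but is not a copy of them. The function names, the int8 limits used as defaults, and the example scale and zero point (6/255 and -128) are assumptions chosen for illustration, not values read from the model.

```python
import math


def tflite_round(value):
    """Round half away from zero (Python's built-in round() rounds half to even)."""
    return int(math.copysign(math.floor(abs(value) + 0.5), value))


def relu6_quantized_range(scale, zero_point, qmin=-128, qmax=127):
    """Map the real ReLU6 bounds [0, 6] to integer clipping bounds."""
    def quantize(real):
        return zero_point + tflite_round(real / scale)

    return max(qmin, quantize(0.0)), min(qmax, quantize(6.0))


def dequantize(q, scale, zero_point):
    """Recover the real value represented by a quantized integer."""
    return scale * (q - zero_point)


# Hypothetical output quantization parameters: the real range [0, 6]
# spread over all 256 int8 levels.
scale, zero_point = 6.0 / 255.0, -128
act_min, act_max = relu6_quantized_range(scale, zero_point)
print(act_min, act_max)  # -128 127
print(dequantize(act_min, scale, zero_point), dequantize(act_max, scale, zero_point))
```

Under these assumed parameters, the integers -128 and 127 represent approximately 0 and 6, so the raw output tensor is clipped correctly even though its values lie outside [0, 6]. Dequantizing the tensor before comparing it with a float reference would bring it back into that range.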

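First, get the layer's parameters (kernel, bias, scales, offsets) with the interpreter.get_tensor_details() command, and compute output_multiplier with the GetQuantizedConvolutionMultipler and QuantizeMultiplierSmallerThanOne functions.
Next, subtract the input offset from the input layer before padding, and implement a simple convolution.
Then use the MultiplyByQuantizedMultiplierSmallerThanOne function, which relies on SaturatingRoundingDoublingHighMul and RoundingDivideByPOT from the gemmlowp/fixedpoint.h library.
Finally, add output_offset to the result and clip it using the values obtained from the CalculateActivationRangeUint8 function.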
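The steps above can be sketched in NumPy roughly as follows. This is an illustrative reconstruction, not the code from the original implementation. The Python function names are my own analogues of the C++ helpers, and I assume a single per-tensor multiplier in (0, 1) equal to input_scale * filter_scale / output_scale, an (H, W, Cin) input, and an (Cout, KH, KW, Cin) kernel. The saturation branch of SaturatingRoundingDoublingHighMul is omitted because the multiplier mantissa is always positive here.

```python
import math
import numpy as np


def quantize_multiplier(real_multiplier):
    """Split a real multiplier in (0, 1) into an int32 mantissa and a right shift."""
    mantissa, exponent = math.frexp(real_multiplier)  # real = mantissa * 2**exponent
    q = int(math.floor(mantissa * (1 << 31) + 0.5))
    if q == (1 << 31):  # rounding pushed the mantissa up to 1.0
        q //= 2
        exponent += 1
    return q, -exponent


def rounding_doubling_high_mul(a, b):
    """Compute round(a * b / 2**31) elementwise, truncating toward zero after the nudge."""
    ab = a.astype(np.int64) * np.int64(b)
    nudge = np.where(ab >= 0, np.int64(1 << 30), np.int64(1 - (1 << 30)))
    total = ab + nudge
    return np.where(total >= 0, total >> 31, -((-total) >> 31))


def rounding_divide_by_pot(x, exponent):
    """Divide by 2**exponent, rounding to nearest with ties away from zero."""
    mask = np.int64((1 << exponent) - 1)
    remainder = x & mask
    threshold = (mask >> 1) + (x < 0)
    return (x >> exponent) + (remainder > threshold)


def quantized_conv2d(x_q, w_q, bias, in_zp, out_zp, multiplier,
                     pads, stride, act_min, act_max):
    """pads is ((top, bottom), (left, right)); bias is int32 with shape (Cout,)."""
    # Subtract the input offset first, so zero padding equals padding with in_zp.
    x = x_q.astype(np.int32) - in_zp
    x = np.pad(x, (pads[0], pads[1], (0, 0)))
    w = w_q.astype(np.int32)
    cout, kh, kw, _ = w.shape
    oh = (x.shape[0] - kh) // stride + 1
    ow = (x.shape[1] - kw) // stride + 1
    acc = np.zeros((oh, ow, cout), dtype=np.int32)
    for i in range(oh):
        for j in range(ow):
            patch = x[i * stride:i * stride + kh, j * stride:j * stride + kw, :]
            acc[i, j, :] = np.tensordot(w, patch, axes=([1, 2, 3], [0, 1, 2])) + bias
    # Rescale the int32 accumulator with fixed-point arithmetic only.
    q_mult, right_shift = quantize_multiplier(multiplier)
    scaled = rounding_divide_by_pot(rounding_doubling_high_mul(acc, q_mult), right_shift)
    # Add the output offset and clip to the activation range.
    return np.clip(scaled + out_zp, act_min, act_max).astype(np.int8)
```

A tiny check with made-up numbers: a 1x1 kernel of value 2, input 10, input offset -2, bias 4, and multiplier 0.5 accumulates (10 + 2) * 2 + 4 = 28, rescales to 14, and gives 14 - 128 = -114 after adding an output offset of -128. Models quantized per channel would need one multiplier and shift per output channel instead of the single pair assumed here.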

Link to the issue on the project's GitHub page


python

quantization

convolution

tensorflow

tensorflow-lite