Is it possible to configure TFLite to return a model with bias quantized to int8?
I am working with Keras/TensorFlow to develop an ANN that will be deployed on a low-end MCU. To that end, I quantized the original ANN using the post-training quantization mechanism provided by TensorFlow Lite. The weights are indeed quantized to int8, but the biases are converted from float to int32. This is a problem because I intend to implement this ANN with CMSIS-NN, which only supports int8 and int16 data.
Is it possible to configure TF Lite to also quantize the biases to int8? Here is the code I am running:
import tensorflow as tf

def quantizeToInt8(representativeDataset):
    # Cast the dataset to float32 and wrap it so it yields one sample per batch
    data = tf.cast(representativeDataset, tf.float32)
    data = tf.data.Dataset.from_tensor_slices(data).batch(1)

    # Generator function that returns one data point per iteration
    def representativeDatasetGen():
        for inputValue in data:
            yield [inputValue]

    # ANN quantization
    model = tf.keras.models.load_model("C:/Users/miguel/Documents/Universidade/PhD/Code_Samples/TensorFlow/originalModel.h5")
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representativeDatasetGen
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.target_spec.supported_types = [tf.int8]
    converter.inference_type = tf.int8
    converter.inference_input_type = tf.int8   # or tf.uint8
    converter.inference_output_type = tf.int8  # or tf.uint8
    tflite_quant_model = converter.convert()

    return tflite_quant_model
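A quick way to confirm what the converter produces is to list every tensor's dtype through the standard tf.lite.Interpreter API (a small check on the buffer returned above): the weight tensors come out as int8, while every bias tensor reports int32.

    tflite_quant_model = quantizeToInt8(representativeDataset)

    # Load the converted flatbuffer and print each tensor's name and dtype;
    # bias tensors show up as numpy.int32, weight tensors as numpy.int8.
    interpreter = tf.lite.Interpreter(model_content=tflite_quant_model)
    interpreter.allocate_tensors()
    for detail in interpreter.get_tensor_details():
        print(detail["name"], detail["dtype"])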
From the comments
It's not possible to configure TFLite to do that. Biases are intentionally int32; otherwise the quantization accuracy would not be good. In order to make this work, you'd have to add a new op or a custom op and then come up with custom quantization tooling altogether. (Paraphrased from Meghna Natraj.)
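The int32 choice falls out of the integer arithmetic itself: in the standard TFLite quantization scheme, int8 activations and int8 weights are multiplied and summed in an int32 accumulator, and the bias is added inside that accumulator with scale input_scale * weight_scale. A minimal numpy sketch (all scales and values invented for illustration) shows why a bias quantized at that scale rarely fits in int8:

    import numpy as np

    # Hypothetical per-tensor scales (real value ~= scale * quantized value)
    s_in, s_w = 0.05, 0.002

    x_q = np.array([12, -7, 33], dtype=np.int8)   # quantized activations
    w_q = np.array([25, -90, 14], dtype=np.int8)  # quantized weights

    # int8 * int8 products are summed in an int32 accumulator to avoid overflow
    acc = np.sum(x_q.astype(np.int32) * w_q.astype(np.int32))

    # The bias lives in the same int32 domain, quantized at scale s_in * s_w
    bias_real = 0.73
    bias_q = np.int32(round(bias_real / (s_in * s_w)))  # 7300: far outside [-128, 127]
    acc += bias_q

    print(bias_q)              # 7300 -> cannot be represented as int8
    print(acc * (s_in * s_w))  # dequantized layer output

Clamping a value like 7300 into the int8 range would discard almost all of the bias, which is why the converter pins biases to int32.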