Int8 quantization of an LSTM model. No matter which version, I run into issues

Question

I want to quantize an LSTM model using a generator.

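Questions

I start with the question because this is quite a long post. What I actually want to know is whether you have managed to quantize (int8) an LSTM model with post-training quantization.

I tried different TF versions but always ran into an error. Below are some of my attempts. Maybe you can spot a mistake I made or have a suggestion. Thanks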

Working part

The expected input is (batch, 1, 45). Running inference with the unquantized model works fine. The model and the csv can be found here:
csv file: https://mega.nz/file/5FciFDaR#Ev33Ij124vUmOF02jWLu0azxZs-Yahyp6PPGOqr8tok
model file: https://mega.nz/file/UAMgUBQA#oK-E0LjZ2YfShPlhHN3uKg8t7bALc2VAONpFirwbmys

import tensorflow as tf
import numpy as np
import pathlib as path
import pandas as pd  

def reshape_for_Lstm(data):    
    timesteps=1
    samples=int(np.floor(data.shape[0]/timesteps))
    data=data.reshape((samples,timesteps,data.shape[1]))   #samples, timesteps, sensors     
    return data

if __name__ == '__main__':

    # Load the CSV and reshape it to (samples, timesteps, sensors)
    data=pd.read_csv('./test_x_data_OOP3.csv', index_col=[0])
    data=np.array(data)
    data=reshape_for_Lstm(data)

    # Load the saved Keras model
    saved_model_dir= path.Path.cwd() / 'model' / 'singnature_model_tf_2.7.0-dev20210914'
    model=tf.keras.models.load_model(saved_model_dir)

    # Run inference; the model returns two outputs
    [yhat,yclass] = model.predict(data)
    Yclass=[np.argmax(yclass[i],0) for i in range(len(yclass))] # get final class

    print('all good')

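The shape and dtype of the variable data are (20000, 1, 45), float64.

Where it goes wrong

Now I want to quantize the model. But depending on the TensorFlow version, I run into different errors.

The code options I use are combined below: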

    converter=tf.lite.TFLiteConverter.from_saved_model('./model/singnature_model_tf_2.7.0-dev20210914')
    converter.representative_dataset = batch_generator
    converter.optimizations = [tf.lite.Optimize.DEFAULT]         

    converter.experimental_new_converter = False  
   
    #converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8] 
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8, tf.lite.OpsSet.TFLITE_BUILTINS]
    #converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
    
    #converter._experimental_lower_tensor_list_ops = False

    converter.target_spec.supported_types = [tf.int8]
    quantized_tflite_model = converter.convert()

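TensorFlow 2.2

Using TF 2.2, as is often suggested on Git, I run into operators that are not supported in tflite. I used a model created with TF 2.2 to make sure the version is supported. Only the TOCO conversion is supported here.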

Some of the operators in the model are not supported by the standard TensorFlow Lite runtime and are not recognized by TensorFlow.

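The error does not depend on the converter.target_spec.supported_ops option, so I could not find a solution. allow_custom_ops only shifts the problem. There are quite a few git issues out there (just some examples), but none of the suggested options worked.
One suggestion is to try the new MLIR converter; however, in 2.2 the integer conversion for MLIR was not done yet.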

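So let's try a newer version.

TensorFlow 2.5.0

Then I tried a well-vetted version. Here, no matter which converter.target_spec.supported_ops I use, I run into the following error with the MLIR conversion: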

in the calibrator.py

ValueError: Failed to parse the model: pybind11::init(): factory function returned nullptr.

The solution suggested on Git is to use TF==2.2.0.

With the TOCO conversion, I get the following error:

tensorflow/lite/toco/allocate_transient_arrays.cc:181] An array, StatefulPartitionedCall/StatefulPartitionedCall/model/lstm/TensorArrayUnstack/TensorListFromTensor, still does not have a known data type after all graph transformations have run. Fatal Python error: Aborted

I did not find anything about this error. Maybe it is solved in 2.6.

TensorFlow 2.6.0

Here, no matter which converter.target_spec.supported_ops I use, I run into the following error:

ValueError: Failed to parse the model: Only models with a single subgraph are supported, model had 5 subgraphs.

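The model is a five-layer model, so it looks like each layer is treated as a subgraph. I did not find an answer on how to merge them into a single subgraph. The issue is apparently with 2.6.0 and is solved in 2.7. So, let's try the nightly build.

TensorFlow 2.7-nightly (tried 2.7.0-dev20210914 and 2.7.0-dev20210921)

Here Python 3.7 has to be used, as 3.6 is no longer supported.

Here we have to use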

converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]

However, even though it is stated that

converter._experimental_lower_tensor_list_ops = False

should be set, it does not seem to be necessary.

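The problem here is that, as far as I understand, tf.lite.OpsSet.SELECT_TF_OPS ends up calling calibrator.py. In calibrator.py the representative_dataset is expected to provide specific generator data: from line 93 onwards, in the _feed_tensor() function, the generator has to yield a dictionary, list, or tuple. The tf.lite.RepresentativeDataset function description and the tflite class description state that the dataset should look the same as the model's input, which in my case (and in most cases) is just a numpy array with the correct dimensions.

I could try to convert my data into a tuple here, but that does not seem right. Or is that actually the way to go?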
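For reference, the commonly documented form of a representative dataset is a generator that yields a list holding one array per model input. Below is a minimal sketch of that form, assuming the model has a single float32 input of shape (1, 1, 45). The helper name make_batch_generator, the num_samples parameter, and the random placeholder data are illustrative and not part of my original code.

```python
import numpy as np


def make_batch_generator(data, num_samples=100):
    """Build a representative-dataset generator over `data` of shape (N, 1, 45).

    Assumption: the model has one float32 input, so each yielded item is a
    list containing exactly one array with a leading batch axis of size 1.
    """
    def batch_generator():
        for sample in data[:num_samples]:
            # sample has shape (1, 45); restore the batch axis and cast
            # float64 -> float32 so the dtype matches the assumed model input.
            yield [sample[np.newaxis, ...].astype(np.float32)]

    return batch_generator


# Placeholder data with the same shape as the real CSV-derived array.
data = np.random.rand(20000, 1, 45)
batch_generator = make_batch_generator(data)

first = next(batch_generator())
print(type(first).__name__, first[0].shape, first[0].dtype)
# -> list (1, 1, 45) float32
```

The resulting batch_generator can be assigned to converter.representative_dataset as in the converter code above. Whether this alone resolves the calibrator error for this particular model is not something I have confirmed.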

Thanks a lot for reading all of this. If I find an answer, I will of course update the post.

Answer 1

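If possible, you can try modifying your LSTM so that it can be converted to TFLite's fused LSTM operator: https://www.tensorflow.org/lite/convert/rnn. It supports full-integer quantization for the basic fused LSTM and UnidirectionalSequenceLSTM operators.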
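As a rough illustration of what such a model might look like, here is a minimal sketch that uses the standard Keras LSTM layer with a fixed batch size and input shape. It assumes this is the form the converter maps to the fused operator, as the linked guide describes. The function name, layer sizes, and single softmax output are hypothetical; the asker's actual five-layer, two-output model is not known here.

```python
def build_lstm_model(timesteps=1, features=45, units=32, num_classes=3):
    """Build a small Keras model around the standard LSTM layer.

    All sizes are placeholders chosen for illustration only.
    """
    # Imported lazily because TensorFlow is a heavy dependency.
    import tensorflow as tf

    # Fixing batch_size gives the converter a fully static input shape.
    inputs = tf.keras.Input(shape=(timesteps, features), batch_size=1)
    # Standard Keras LSTM layer with default activations.
    x = tf.keras.layers.LSTM(units)(inputs)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)
```

After training, a model built this way could be passed to tf.lite.TFLiteConverter.from_keras_model. Whether the fused operator is actually emitted depends on the TensorFlow version, so it is worth inspecting the converted model.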

Answer 2

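I have the same problem as you and I am still working on it, but I noticed some differences between our code, so sharing mine may be useful.

I am using TF 2.7.0, and the conversion works fine when using: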

converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS, tf.lite.OpsSet.TFLITE_BUILTINS_INT8]

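Anyway, as far as I know, using these options (the ones you mentioned) does not guarantee a fully quantized model, so you may not be able to deploy it entirely on microcontrollers or on TPU systems such as the Google Coral.

When using the conversion option that the official guide recommends for full quantization: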

converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]

the conversion fails.

I recently managed to solve this problem! One extra line of code is needed when configuring the converter:

converter.target_spec.supported_types = [tf.int8]

This is the link to the tutorial I followed: https://colab.research.google.com/github/google-coral/tutorials/blob/master/train_lstm_timeseries_ptq_tf2.ipynb#scrollTo=EBRDh9SZVBX1
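Putting the options from this answer together, a minimal sketch of the converter setup might look like the following. The function name build_int8_converter is made up for illustration, and representative_dataset is assumed to be a generator function that yields a list with one float32 array per model input.

```python
def build_int8_converter(saved_model_dir, representative_dataset):
    """Configure a TFLite converter for full-integer post-training quantization.

    Only the options named in this answer are set; nothing else is changed.
    """
    # Imported lazily because TensorFlow is a heavy dependency.
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    # Restrict the converter to int8 builtin ops (full quantization).
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    # The extra line that made the conversion succeed for me.
    converter.target_spec.supported_types = [tf.int8]
    return converter
```

With the path from the question, usage would be along the lines of converter = build_int8_converter('./model/singnature_model_tf_2.7.0-dev20210914', batch_generator), followed by quantized_tflite_model = converter.convert(). I have only tried this on TF 2.7.0, so other versions may behave differently.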


python

quantization

tensorflow

tensorflow-lite

tensorflow2.0