Converting an SSD object detection model to TFLite and quantizing it from float to uint8 for the Edge TPU

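I am having trouble converting an SSD object detection model to a uint8 TFLite model for the Edge TPU.

I have been searching different forums, Stack Overflow threads, and GitHub issues, and as far as I can tell I am following the correct steps. Something must be wrong in my Jupyter notebook, since I cannot achieve my goal.

I am sharing the steps below, as explained in my Jupyter notebook. I think that will make things clearer.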

#!/usr/bin/env python
# coding: utf-8

Setup

This step clones the repository. If you have already done it once, you can skip this step.

import os
import pathlib

# Clone the tensorflow models repository if it doesn't already exist
if "models" in pathlib.Path.cwd().parts:
  while "models" in pathlib.Path.cwd().parts:
    os.chdir('..')
elif not pathlib.Path('models').exists():
  !git clone --depth 1 https://github.com/tensorflow/models

Imports

Required step: this only imports the libraries used below.

import matplotlib
import matplotlib.pyplot as plt
import pathlib
import os
import random
import io
import imageio
import glob
import scipy.misc
import numpy as np
from six import BytesIO
from PIL import Image, ImageDraw, ImageFont
from IPython.display import display, Javascript
from IPython.display import Image as IPyImage

import tensorflow as tf
import tensorflow_datasets as tfds


from object_detection.utils import label_map_util
from object_detection.utils import config_util
from object_detection.utils import visualization_utils as viz_utils
#from object_detection.utils import colab_utils
from object_detection.builders import model_builder

%matplotlib inline

Downloading a TFLite-friendly model

For TFLite, SSD networks are recommended. I downloaded the following object detection model, which works with 320x320 images.
# Download the checkpoint and put it into models/research/object_detection/test_data/

!wget http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.tar.gz
!tar -xf ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.tar.gz
!if [ -d "models/research/object_detection/test_data/checkpoint" ]; then rm -Rf models/research/object_detection/test_data/checkpoint; fi
!mkdir models/research/object_detection/test_data/checkpoint
!mv ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8/checkpoint models/research/object_detection/test_data/

List of strings used to add the correct label to each box.

PATH_TO_LABELS = '/home/jose/codeWorkspace-2.4.1/tf_2.4.1/models/research/object_detection/data/mscoco_label_map.pbtxt'
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)

Using TFLite

Export and run

Model conversion

In this step I convert the SavedModel (.pb) to .tflite.

!tflite_convert --saved_model_dir=/home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/saved_model --output_file=/home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/model.tflite

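Model quantization (from float to uint8)

After converting the model, I need to quantize it. The original model takes a float tensor as input. Since I want to run it on an Edge TPU, I need the input and output tensors to be uint8.

Generating the calibration dataset.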

def representative_dataset_gen():
    folder = "/home/jose/codeWorkspace-2.4.1/tf_2.4.1/images_ssd_mb2_2"
    image_size = 320
    raw_test_data = []

    files = glob.glob(folder+'/*.jpeg')
    for file in files:
        image = Image.open(file)
        image = image.convert("RGB")
        image = image.resize((image_size, image_size))
        # Normalize pixel values from [0, 255] to [-1, 1]
        image = (2.0 / 255.0) * np.float32(image) - 1.0
        #image = np.asarray(image).astype(np.float32)
        image = image[np.newaxis,:,:,:]
        raw_test_data.append(image)

    for data in raw_test_data:
        yield [data]

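(Do not run this.) This is the same step as above, but with random values.

If you don't have a dataset, you can also feed randomly generated values as if they were images. This is the code I used for that: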
####THIS IS A RANDOM-GENERATED DATASET#### 
def representative_dataset_gen():
    for _ in range(320):
      data = np.random.rand(1, 320, 320, 3)
      yield [data.astype(np.float32)]

Running the model conversion

converter = tf.lite.TFLiteConverter.from_saved_model('/home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/saved_model')
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8, tf.lite.OpsSet.SELECT_TF_OPS]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
converter.allow_custom_ops = True
converter.representative_dataset = representative_dataset_gen
tflite_model = converter.convert()

Warning:

The conversion step returns a warning.

WARNING:absl:For model inputs containing unsupported operations which cannot be quantized, the inference_input_type attribute will default to the original type.
WARNING:absl:For model outputs containing unsupported operations which cannot be quantized, the inference_output_type attribute will default to the original type.

This makes me think the conversion is not correct.

Saving the model

# Write the quantized model to disk
with open('/home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/model_full_integer_quant.tflite', 'wb') as w:
    w.write(tflite_model)
print("tflite convert complete! - /home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/model_full_integer_quant.tflite")

Tests

Test 1: Get the TensorFlow version

I read that using the nightly build is recommended for this. In my case, the version is 2.6.0.

print(tf.version.VERSION)

Test 2: Get the input/output tensor details

interpreter = tf.lite.Interpreter(model_path="/home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/model_full_integer_quant.tflite")
interpreter.allocate_tensors()

print(interpreter.get_input_details())
print("@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@")
print(interpreter.get_output_details())
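The raw printout is hard to scan, so one option is to reduce each tensor to its name, dtype, and quantization tuple. The sketch below is only an illustration: `summarize_io` and `is_fully_uint8` are hypothetical helper names, and the two stand-in dtype classes exist only so the example runs without TensorFlow or NumPy. It relies on the `name`, `dtype`, and `quantization` keys that appear in the printed details.

```python
def summarize_io(details):
    """Return (name, dtype name, quantization tuple) for each tensor detail dict."""
    summary = []
    for d in details:
        dtype = d["dtype"]
        # NumPy scalar types such as numpy.uint8 expose __name__ == "uint8".
        dtype_name = getattr(dtype, "__name__", str(dtype))
        summary.append((d["name"], dtype_name, d["quantization"]))
    return summary


def is_fully_uint8(details):
    """True only if every listed tensor has dtype uint8."""
    return all(name == "uint8" for _, name, _ in summarize_io(details))


# Stand-in dtype classes (assumption: used here instead of numpy.uint8 and
# numpy.float32 so that the sketch has no third-party dependencies).
class uint8:
    pass


class float32:
    pass


example_inputs = [
    {"name": "serving_default_input:0", "dtype": uint8,
     "quantization": (0.007843137718737125, 127)},
]
example_outputs = [
    {"name": "StatefulPartitionedCall:31", "dtype": float32,
     "quantization": (0.0, 0)},
]

print(summarize_io(example_inputs))
print(is_fully_uint8(example_inputs), is_fully_uint8(example_outputs))
```

With a real interpreter, the same helpers could be called as `is_fully_uint8(interpreter.get_input_details())` and `is_fully_uint8(interpreter.get_output_details())`.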

Test 2 results:

I get the following information:

[{'name': 'serving_default_input:0', 'index': 0, 'shape': array([ 1, 320, 320, 3], dtype=int32), 'shape_signature': array([ 1, 320, 320, 3], dtype=int32), 'dtype': <class 'numpy.uint8'>, 'quantization': (0.007843137718737125, 127), 'quantization_parameters': {'scales': array([0.00784314], dtype=float32), 'zero_points': array([127], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

[{'name': 'StatefulPartitionedCall:31', 'index': 377, 'shape': array([ 1, 10, 4], dtype=int32), 'shape_signature': array([ 1, 10, 4], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'StatefulPartitionedCall:32', 'index': 378, 'shape': array([ 1, 10], dtype=int32), 'shape_signature': array([ 1, 10], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'StatefulPartitionedCall:33', 'index': 379, 'shape': array([ 1, 10], dtype=int32), 'shape_signature': array([ 1, 10], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'StatefulPartitionedCall:34', 'index': 380, 'shape': array([1], dtype=int32), 'shape_signature': array([1], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]

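The input tensor is uint8, but the output tensors are still float32, so I think the model was not quantized correctly.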
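For reference, the `quantization` tuple reported for the input tensor is `(scale, zero_point)`, and TFLite relates a stored integer `q` to a real value through `real = scale * (q - zero_point)`. The sketch below applies that relation to the input parameters printed above; the function names `quantize` and `dequantize` are illustrative, not part of the TFLite API. It suggests that the reported parameters roughly undo the [-1, 1] normalization used in the calibration step.

```python
# Values copied from the input tensor details printed in Test 2.
INPUT_SCALE = 0.007843137718737125
INPUT_ZERO_POINT = 127


def quantize(real_value, scale=INPUT_SCALE, zero_point=INPUT_ZERO_POINT):
    """Map a real value to uint8 using q = round(real / scale) + zero_point."""
    q = round(real_value / scale) + zero_point
    # Clamp to the representable uint8 range.
    return max(0, min(255, q))


def dequantize(q, scale=INPUT_SCALE, zero_point=INPUT_ZERO_POINT):
    """Map a uint8 value back to a real value: real = scale * (q - zero_point)."""
    return scale * (q - zero_point)


for value in (-1.0, 0.0, 0.5, 1.0):
    q = quantize(value)
    print(value, "->", q, "->", dequantize(q))
```

Since the scale is close to 2/255, a normalized pixel in [-1, 1] lands near its original 0..255 value, which is why a uint8 input tensor can usually be fed raw image bytes.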

Compiling the generated model for the Edge TPU

!edgetpu_compiler -s /home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/model_full_integer_quant.tflite

jose@jose-VirtualBox:~/python-envs$ edgetpu_compiler -s /home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/model_full_integer_quant.tflite
Edge TPU Compiler version 15.0.340273435

Model compiled successfully in 1136 ms.

Input model: /home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/model_full_integer_quant.tflite
Input size: 3.70MiB
Output model: model_full_integer_quant_edgetpu.tflite
Output size: 4.21MiB
On-chip memory used for caching model parameters: 3.42MiB
On-chip memory remaining for caching model parameters: 4.31MiB
Off-chip memory used for streaming uncached model parameters: 0.00B
Number of Edge TPU subgraphs: 1
Total number of operations: 162
Operation log: model_full_integer_quant_edgetpu.log

Model successfully compiled but not all operations are supported by the Edge TPU. A percentage of the model will instead run on the CPU, which is slower. If possible, consider updating your model to use only operations supported by the Edge TPU. For details, visit g.co/coral/model-reqs.
Number of operations that will run on Edge TPU: 112
Number of operations that will run on CPU: 50

Operator             Count   Status

LOGISTIC             1       Operation is otherwise supported, but not mapped due to some unspecified limitation
DEPTHWISE_CONV_2D    14      More than one subgraph is not supported
DEPTHWISE_CONV_2D    37      Mapped to Edge TPU
QUANTIZE             1       Mapped to Edge TPU
QUANTIZE             4       Operation is otherwise supported, but not mapped due to some unspecified limitation
CONV_2D              58      Mapped to Edge TPU
CONV_2D              14      More than one subgraph is not supported
DEQUANTIZE           1       Operation is working on an unsupported data type
DEQUANTIZE           1       Operation is otherwise supported, but not mapped due to some unspecified limitation
CUSTOM               1       Operation is working on an unsupported data type
ADD                  2       More than one subgraph is not supported
ADD                  10      Mapped to Edge TPU
CONCATENATION        1       Operation is otherwise supported, but not mapped due to some unspecified limitation
CONCATENATION        1       More than one subgraph is not supported
RESHAPE              2       Operation is otherwise supported, but not mapped due to some unspecified limitation
RESHAPE              6       Mapped to Edge TPU
RESHAPE              4       More than one subgraph is not supported
PACK                 4       Tensor has unsupported rank (up to 3 innermost dimensions mapped)
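To see where the 50 CPU operations come from, the operator table can be grouped by status. This is a small sketch: the rows are transcribed by hand from the compiler output above, and `count_by_status` is an illustrative helper, not something the Edge TPU compiler provides.

```python
from collections import defaultdict

MAPPED = "Mapped to Edge TPU"
UNSPECIFIED = ("Operation is otherwise supported, but not mapped "
               "due to some unspecified limitation")
SUBGRAPH = "More than one subgraph is not supported"
DATA_TYPE = "Operation is working on an unsupported data type"
RANK = "Tensor has unsupported rank (up to 3 innermost dimensions mapped)"

# (operator, count, status) rows copied from the compiler log.
OPERATOR_LOG = [
    ("LOGISTIC", 1, UNSPECIFIED),
    ("DEPTHWISE_CONV_2D", 14, SUBGRAPH),
    ("DEPTHWISE_CONV_2D", 37, MAPPED),
    ("QUANTIZE", 1, MAPPED),
    ("QUANTIZE", 4, UNSPECIFIED),
    ("CONV_2D", 58, MAPPED),
    ("CONV_2D", 14, SUBGRAPH),
    ("DEQUANTIZE", 1, DATA_TYPE),
    ("DEQUANTIZE", 1, UNSPECIFIED),
    ("CUSTOM", 1, DATA_TYPE),
    ("ADD", 2, SUBGRAPH),
    ("ADD", 10, MAPPED),
    ("CONCATENATION", 1, UNSPECIFIED),
    ("CONCATENATION", 1, SUBGRAPH),
    ("RESHAPE", 2, UNSPECIFIED),
    ("RESHAPE", 6, MAPPED),
    ("RESHAPE", 4, SUBGRAPH),
    ("PACK", 4, RANK),
]


def count_by_status(rows):
    """Sum operation counts per status string."""
    totals = defaultdict(int)
    for _operator, count, status in rows:
        totals[status] += count
    return dict(totals)


totals = count_by_status(OPERATOR_LOG)
total_ops = sum(totals.values())
for status, count in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{count:4d}  {100.0 * count / total_ops:5.1f}%  {status}")
```

The grouped counts add up to the 162 operations the compiler reports, with 112 mapped to the Edge TPU.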

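The Jupyter notebook I prepared can be found at the following link: https://github.com/jagumiel/Artificial-Intelligence/blob/main/tensorflow-scripts/Step-by-step-explaining-problems.ipynb

Is there a step I am missing? Why is my conversion not producing the expected result?

Thank you very much.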

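As @JaesungChung answered, the process itself is correct.

My problem was in the application that runs the .tflite model. I quantized the model outputs to uint8, so I had to rescale the values I obtained to get correct results.

For example, I got 10 objects because I was asking for all detected objects with a score above 0.5. My results were not rescaled, so a detected object's score could easily be 104. I had to rescale that number by dividing it by 255.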
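A minimal sketch of that score rescaling, assuming the raw scores arrive as integers in 0..255. The default `scale=1/255` and `zero_point=0` mirror the "divide by 255" fix described here; in general the output tensor's own quantization parameters would be the safer source, which is an assumption on my part rather than something shown in the post. The helper names are hypothetical.

```python
def rescale_scores(raw_scores, scale=1.0 / 255.0, zero_point=0):
    """Convert raw uint8 scores to floats using real = scale * (q - zero_point)."""
    return [scale * (score - zero_point) for score in raw_scores]


def filter_detections(raw_scores, threshold=0.5):
    """Return (index, score) pairs whose rescaled score exceeds the threshold."""
    scores = rescale_scores(raw_scores)
    return [(i, s) for i, s in enumerate(scores) if s > threshold]


# Made-up raw scores for illustration only. Comparing them to 0.5 without
# rescaling would accept all of them, since every value is greater than 0.5.
raw = [104, 200, 12]
print(rescale_scores(raw))
print(filter_detections(raw))
```

Applying the threshold after rescaling is what keeps the 0.5 cutoff meaningful.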

The same thing happened when drawing the results, so I also had to rescale those numbers using the image height and width.
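A sketch of how the box rescaling might look. It makes two assumptions that the post does not confirm: that each box is ordered `[ymin, xmin, ymax, xmax]` (the usual convention in the TensorFlow Object Detection API), and that the raw uint8 values map to the 0..1 range by dividing by 255. `box_to_pixels` is a hypothetical helper name.

```python
def box_to_pixels(raw_box, image_height, image_width, max_value=255.0):
    """Convert a raw uint8 box to pixel coordinates.

    Assumes raw_box is (ymin, xmin, ymax, xmax) with values in 0..max_value.
    """
    ymin, xmin, ymax, xmax = (value / max_value for value in raw_box)
    return (
        ymin * image_height,
        xmin * image_width,
        ymax * image_height,
        xmax * image_width,
    )


# Made-up box for illustration: the full image on a 320x320 input.
print(box_to_pixels((0, 0, 255, 255), 320, 320))
```

The resulting pixel coordinates can then be passed to whatever drawing routine the application uses.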