Use .tflite with iOS and GPU

Question

I have created a new tflite model based on MobilenetV2. It works well on iOS using the CPU without quantization. I should say that the TensorFlow team did a great job, many thanks.

Unfortunately, there is a latency problem. I am testing my model on an iPhone 5s, and I get the following results on the CPU:

500ms for MobilenetV2 with 224*224 input image.
250-300ms for MobilenetV2 with 160*160 input image.
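As a rough sanity check on these two measurements, the short sketch below compares the latency ratio with the ratio of input pixels. It assumes that inference cost scales roughly linearly with the number of input pixels, which is an approximation for convolutional networks and not something stated in the question.

```python
# Latencies reported in the question (iPhone 5s, CPU, non-quantized model).
latency_224_ms = 500
latency_160_ms_range = (250, 300)

# Assumption: cost grows roughly linearly with the number of input pixels.
pixel_ratio = (160 * 160) / (224 * 224)
predicted_160_ms = latency_224_ms * pixel_ratio

print(f"pixel ratio: {pixel_ratio:.2f}")
print(f"predicted 160x160 latency: {predicted_160_ms:.0f} ms")
```

The prediction of about 255 ms falls inside the reported 250-300 ms range, so on this evidence shrinking the input alone only buys about a 2x speedup.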

I use the following pod: 'TensorFlowLite', '~> 1.13.1'

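That is not enough, so I read the TF documentation on optimization (post-training quantization). I think I need to use Float16 or UInt8 quantization together with the GPU delegate (see https://www.tensorflow.org/lite/performance/post_training_quantization). I used TensorFlow v2.1.0 to train and quantize my model.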
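For reference, a minimal sketch of float16 post-training quantization with the TensorFlow 2.x Python converter might look like the following. The question does not show the conversion script actually used, so the function name `convert_to_float16` and the `keras_model` argument are hypothetical.

```python
def convert_to_float16(keras_model):
    """Convert a trained Keras model to a .tflite flatbuffer with float16 weights.

    Sketch only: the question does not show the author's actual conversion code.
    """
    # Imported lazily so that merely defining this helper does not require TensorFlow.
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
    # Enable the default post-training optimizations ...
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    # ... and restrict the stored weight type to float16.
    converter.target_spec.supported_types = [tf.float16]
    return converter.convert()  # serialized model as bytes
```

The returned bytes can be written to a file, for example with `open("model_fp16.tflite", "wb").write(...)`, where the file name is again only an illustration.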

Float16 quantization of weights (I used the MobilenetV2 model after Float16 quantization)

https://github.com/tensorflow/examples/tree/master/lite/examples/image_segmentation/ios

pod 'TensorFlowLiteSwift', '0.0.1-nightly'

No errors, but the model does not work

pod 'TensorFlowLiteSwift', '2.1.0'

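2020-05-01 21:36:13.578369+0300 TFL Segmentation[6367:330410] Initialized TensorFlow Lite runtime.
2020-05-01 21:36:20.877393+0300 TFL Segmentation[6367:330397] Execution of the command buffer was aborted due to an error during execution. Caused GPU Hang Error (IOAF code 3)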

Full integer quantization of weights and activations

pod 'TensorFlowLiteGpuExperimental'

Code sample: https://github.com/makeml-app/MakeML-Nails/tree/master/Segmentation%20Nails

I used the MobilenetV2 model after uint8 quantization
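A full integer conversion of this kind is typically driven by a small calibration set. The sketch below shows one way to do it with the TensorFlow 2.x converter. The function name `convert_full_integer` and the `calibration_images` argument (assumed to be an iterable of NumPy image arrays already preprocessed like the training data) are hypothetical and do not come from the question.

```python
def convert_full_integer(keras_model, calibration_images):
    """Convert a Keras model to a .tflite flatbuffer with integer weights and activations.

    Sketch only: calibration_images is assumed to be an iterable of NumPy arrays
    shaped like a single model input, without the batch dimension.
    """
    # Imported lazily so that merely defining this helper does not require TensorFlow.
    import tensorflow as tf

    def representative_dataset():
        # The converter runs these samples to estimate activation ranges.
        for image in calibration_images:
            yield [image[None, ...].astype("float32")]  # add the batch dimension

    converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    # Require integer kernels for every op; conversion fails if an op has none.
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    return converter.convert()  # serialized model as bytes
```

A model produced this way contains integer kernels, which matters for the GPU delegate discussion in the answers.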

GpuDelegateOptions options;
options.allow_precision_loss = true;
options.wait_type = GpuDelegateOptions::WaitType::kActive;

// delegate = NewGpuDelegate(nullptr);
delegate = NewGpuDelegate(&options);

if (interpreter->ModifyGraphWithDelegate(delegate) != kTfLiteOk)

Segmentation Live[6411:331887] [DYMTLInitPlatform] platform initialization successful Loaded model 1 resolved reporter Didn't find op for builtin opcode 'PAD' version '2'

Is it possible to use a quantized MobilenetV2 model on iOS somehow? Hopefully I have made a mistake somewhere :) and it is possible.

Best regards, Dmitry

Answer 1

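Sorry, the documentation is outdated - the GPU delegate should be included in TensorFlowLiteSwift 2.1.0. However, it looks like you are using the C API, so depending on TensorFlowLiteC would be sufficient.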

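MobileNetV2 does work with the TFLite runtime on iOS, and if I recall correctly it does not have a PAD op. Could you attach your model file? With the information provided, it is hard to tell what is causing the error. As a sanity check, you can get quant/non-quant versions of MobileNetV2 from here: https://www.tensorflow.org/lite/guide/hosted_models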

For the int8 quantized model - afaik the GPU delegate only works for FP32 and (possibly) FP16 inputs.
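One way to check which case a given .tflite file falls into is to inspect its input and output tensor types before shipping it to the device. The sketch below uses `tf.lite.Interpreter` from the TensorFlow Python package. The helper names `io_dtypes` and `has_float_io` are hypothetical, and the float-only rule simply encodes the statement above rather than a documented guarantee.

```python
# Types the answer above expects the GPU delegate to handle (an assumption).
FLOAT_TYPES = {"float32", "float16"}

def io_dtypes(model_path):
    """Return the dtype names of a .tflite model's input and output tensors."""
    # Imported lazily so has_float_io below stays usable without TensorFlow.
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=model_path)
    details = interpreter.get_input_details() + interpreter.get_output_details()
    # Each entry's "dtype" is a NumPy type such as numpy.float32 or numpy.uint8.
    return [d["dtype"].__name__ for d in details]

def has_float_io(dtype_names):
    """True if every input/output tensor uses a floating-point type."""
    return all(name in FLOAT_TYPES for name in dtype_names)
```

For example, `has_float_io(io_dtypes("model.tflite"))` with a path of your own would indicate whether the model's inputs and outputs are float. This only looks at inputs and outputs, not at the ops inside the graph.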

Answer 2

Here is a link to the GitHub issue with the answers: https://github.com/tensorflow/tensorflow/issues/39101

gpu

quantization

ios

deep-learning

tensorflow