How can I do simple matmul on edge tpu?

I don't know how to use the Python API.

Calling a .tflite model that executes matmul on the Coral accelerator

The .tflite model is generated from some example code here. It works fine with the tf.lite.Interpreter() class, but I don't know how to convert it for use with the edgetpu class. I tried edgetpu.basic.basic_engine.BasicEngine() and changed the model's data type from numpy.float32 to numpy.uint8, but that didn't help. I'm a complete beginner with TensorFlow and just want to use my TPU for matmul.

import numpy
import tensorflow as tf
import edgetpu
from edgetpu.basic.basic_engine import BasicEngine

def export_tflite_from_session(session, input_nodes, output_nodes, tflite_filename):
    print("Converting to tflite...")
    converter = tf.lite.TFLiteConverter.from_session(session, input_nodes, output_nodes)
    tflite_model = converter.convert()
    with open(tflite_filename, "wb") as f:
        f.write(tflite_model)
    print("Converted %s." % tflite_filename)

#This does matmul just fine but does not use the TPU
def test_tflite_model(tflite_filename, examples):
    print("Loading TFLite interpreter for %s..." % tflite_filename)
    interpreter = tf.lite.Interpreter(model_path=tflite_filename)
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()
    print("input details: %s" % input_details)
    print("output details: %s" % output_details)

    for i, input_tensor in enumerate(input_details):
        interpreter.set_tensor(input_tensor['index'], examples[i])
    interpreter.invoke()
    model_output = []
    for i, output_tensor in enumerate(output_details):
        model_output.append(interpreter.get_tensor(output_tensor['index']))
    return model_output

# This should use the TPU, but I don't know how to run the model or whether
# it needs further processing. One matrix can be constant for my use case.
def test_tpu(tflite_filename, examples):
    print("Loading TFLite interpreter for %s..." % tflite_filename)
    # TODO edgetpu.basic
    interpreter = BasicEngine(tflite_filename)
    interpreter.allocate_tensors()  # does not work...

def main():
    tflite_filename = "model.tflite"
    shape_a = (2, 2)
    shape_b = (2, 2)

    a = tf.placeholder(dtype=tf.float32, shape=shape_a, name="A")
    b = tf.placeholder(dtype=tf.float32, shape=shape_b, name="B")
    c = tf.matmul(a, b, name="output")

    numpy.random.seed(1234)
    a_ = numpy.random.rand(*shape_a).astype(numpy.float32)
    b_ = numpy.random.rand(*shape_b).astype(numpy.float32)
    with tf.Session() as session:
        session_output = session.run(c, feed_dict={a: a_, b: b_})
        export_tflite_from_session(session, [a, b], [c], tflite_filename)

    tflite_output = test_tflite_model(tflite_filename, [a_, b_])
    tflite_output = tflite_output[0]

    #test the TPU
    tflite_output = test_tpu(tflite_filename, [a_, b_])

    print("Input example:")
    print(a_)
    print(a_.shape)
    print(b_)
    print(b_.shape)
    print("Session output:")
    print(session_output)
    print(session_output.shape)
    print("TFLite output:")
    print(tflite_output)
    print(tflite_output.shape)
    print(numpy.allclose(session_output, tflite_output))

if __name__ == '__main__':
    main()

You only converted the model once, and your model is not fully compiled for the Edge TPU. From the docs:

At the first point in the model graph where an unsupported operation occurs, the compiler partitions the graph into two parts. The first part of the graph that contains only supported operations is compiled into a custom operation that executes on the Edge TPU, and everything else executes on the CPU.

The model must meet several specific requirements:

  • quantization-aware training

  • tensor sizes and model parameters that are constant at compile time

  • tensors that are 3-dimensional or smaller

  • the model uses only operations supported by the Edge TPU
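The quantization requirement is the one the question's float32 model fails first: the Edge TPU only executes 8-bit integer operations. As a rough numpy illustration of TFLite's affine quantization scheme (real = scale * (q - zero_point)) — not the Edge TPU toolchain itself — with an assumed scale and zero point for values in [0, 1):

```python
import numpy as np

# Affine quantization as used by TFLite: real = scale * (q - zero_point).
def quantize(x, scale, zero_point):
    q = np.round(x / scale) + zero_point
    return np.clip(q, 0, 255).astype(np.uint8)

def dequantize(q, scale, zero_point):
    return scale * (q.astype(np.float32) - zero_point)

# A small float32 matrix with values in [0, 1), like the question's inputs.
x = np.array([[0.1915, 0.6221], [0.4377, 0.7854]], dtype=np.float32)
scale, zero_point = 1.0 / 255.0, 0

q = quantize(x, scale, zero_point)
x_hat = dequantize(q, scale, zero_point)

# The round-trip error is at most half a quantization step.
assert np.max(np.abs(x - x_hat)) <= scale / 2
```

Quantization-aware training (or post-training quantization with a representative dataset) is what produces sensible scale/zero-point values for every tensor in the graph.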

There is an online compiler as well as a CLI version that can be used to convert a .tflite model into an Edge TPU-compatible .tflite model.
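For reference, the CLI version is invoked like this (assuming the compiler package is installed and the input model is already fully uint8-quantized; the output filename is what the compiler produces by default):

```shell
# Compile a quantized .tflite model for the Edge TPU; writes
# model_edgetpu.tflite and a compilation log to the current directory.
# -s prints a per-operation summary of what mapped to the Edge TPU.
edgetpu_compiler -s model.tflite
```

The summary is worth reading: any operation left on the CPU means the graph was partitioned and only part of it will run on the accelerator.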

Your code is also incomplete. You passed your model to the class here:

interpreter = BasicEngine(tflite_filename)

but you are missing the step that actually runs inference. With BasicEngine that step is the run_inference() method, which takes a flattened 1-D uint8 input array and returns the inference latency along with the output tensor:

latency_ms, output = interpreter.run_inference(input_data)