Make prediction faster after training [increasing predicted video fps]

I trained a segmentation model using MobileNetV3Large, but at prediction time its processing speed is poor: roughly 3.95 FPS.

I would like to reach at least 20 fps. Sample code is attached below. Thanks!

from imutils.video import VideoStream
from imutils.video import FPS
import numpy as np
import imutils
import time
import cv2
import tensorflow as tf
from tensorflow.keras.models import load_model

# 'loss' and 'dice_coefficient' are the custom functions used during training
model = load_model('model.h5',
                   custom_objects={'loss': loss, 'dice_coefficient': dice_coefficient},
                   compile=False)

cap = VideoStream(src=0).start()
# warm up the camera for a couple of seconds
time.sleep(2.0)

# Start the FPS timer
fps = FPS().start()

while True:

    frame = cap.read()

    # Resize each frame
    resized_image = cv2.resize(frame, (256, 256))

    # Dividing by 255.0 already yields values in [0, 1]; a plain NumPy cast
    # avoids an extra TensorFlow round-trip on every frame
    resized_image = (resized_image / 255.0).astype(np.float32)
    mask = model.predict(np.expand_dims(resized_image[:,:,:3], axis=0))[0]

    # show the output frame
    cv2.imshow("Frame", mask)

    key = cv2.waitKey(1) & 0xFF
    # Press 'q' key to break the loop
    if key == ord("q"):
        break

    # update the FPS counter
    fps.update()

# stop the timer
fps.stop()

# Display FPS Information: Total Elapsed time and an approximate FPS over the entire video stream
print("[INFO] Elapsed Time: {:.2f}".format(fps.elapsed()))
print("[INFO] Approximate FPS: {:.2f}".format(fps.fps()))

# Destroy windows and cleanup
cv2.destroyAllWindows()
# Stop the video stream
cap.stop()
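Before optimizing, it helps to time the model call in isolation, since `cv2.imshow` and frame capture also eat into the loop's FPS. A minimal timing harness (the `dummy_inference` callable is a stand-in; substitute something like `lambda: model.predict(batch)` from the script above):

```python
import time

def time_callable(fn, n_iters=50):
    """Return the average wall-clock seconds per call of fn over n_iters calls."""
    start = time.perf_counter()
    for _ in range(n_iters):
        fn()
    return (time.perf_counter() - start) / n_iters

# Stand-in workload; replace with the real inference call to measure it alone
dummy_inference = lambda: sum(i * i for i in range(10_000))
avg = time_callable(dummy_inference)
print(f"average latency: {avg * 1000:.2f} ms  (~{1 / avg:.1f} FPS)")
```

If the isolated `model.predict` latency already exceeds 50 ms, the display code is not the bottleneck and model-level optimization is needed.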

EDIT-1

After performing float16 quantization I load the model as tflite_model and feed the input image into it, but the result is even slower!! Is this the right approach?

interpreter = tf.lite.Interpreter('tflite_model.tflite')
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

....................... process .............

while True:
    
    .............  process ............
    
    interpreter.set_tensor(input_details[0]['index'], np.expand_dims(resized_image[:,:,:3], axis=0))
    interpreter.invoke()
    mask = interpreter.get_tensor(output_details[0]['index'])[0]
#     mask = model.predict(np.expand_dims(resized_image[:,:,:3], axis=0))[0]

    ............ display part ........
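For reference, a float16 post-training conversion producing `tflite_model.tflite` would look roughly like this (the `Sequential` model here is only a stand-in for the trained segmentation model, so the snippet is runnable on its own):

```python
import tensorflow as tf

# Stand-in for the trained segmentation model; substitute the result of
# load_model('model.h5', ...) in practice
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(8, 3, padding='same', activation='relu',
                           input_shape=(256, 256, 3)),
    tf.keras.layers.Conv2D(1, 1, activation='sigmoid'),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]  # float16 weight quantization

tflite_bytes = converter.convert()
with open('tflite_model.tflite', 'wb') as f:
    f.write(tflite_bytes)
```

One likely explanation for the slowdown observed above: on a plain CPU, TFLite dequantizes float16 weights back to float32 before computing, so float16 quantization mainly shrinks the file size; latency gains usually require a GPU delegate or full int8 quantization.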

You can make this faster in several different ways:

  1. Model quantization:

     TensorFlow Lite supports converting the weights to 16-bit floating point values; saving the model with tf.float16 weights is probably the easiest option, and retraining in float16 (or quantizing to int8) can be faster still: https://www.tensorflow.org/lite/performance/post_training_float16_quant

  2. Model distillation: you can train a small student model with a loss function based on the large model's outputs, so that it learns everything the large model knows.

  3. Model pruning: you can compress your model by pruning it, which makes it faster. The TensorFlow documentation also covers pruning.
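As a minimal illustration of the distillation idea in point 2, here is the standard temperature-scaled soft-target loss in plain NumPy (function names and the temperature value are illustrative, not from any specific library):

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax along the last axis."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between softened teacher and student distributions.

    The student minimizes this (usually mixed with the ordinary
    hard-label loss); it is zero when the student matches the teacher.
    """
    p = softmax(teacher_logits, T)          # teacher's soft targets
    q = softmax(student_logits, T)
    return float(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))))

teacher = np.array([[2.0, 0.5, -1.0]])
print(distillation_loss(teacher, teacher))           # ~0.0 (student matches teacher)
print(distillation_loss(np.zeros((1, 3)), teacher))  # positive (student disagrees)
```

A higher temperature `T` flattens the teacher's distribution, exposing the relative similarities between classes that a hard one-hot label hides; that extra signal is what lets the small student learn from the large model.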