Make prediction faster after training [Increasing predicted video fps]
I trained a model using mobilenetV3Large that performs segmentation, but at prediction time its processing speed is not great: roughly 3.95 FPS. I would like to reach at least 20 fps. Sample code is attached below. Thanks!
from imutils.video import VideoStream
from imutils.video import FPS
from tensorflow.keras.models import load_model
import tensorflow as tf
import numpy as np
import imutils
import time
import cv2
# loss and dice_coefficient are the custom functions used during training
model = load_model('model.h5', custom_objects={'loss': loss, "dice_coefficient": dice_coefficient}, compile=False)
cap = VideoStream(src=0).start()
# warm up the camera for a couple of seconds
time.sleep(2.0)
# Start the FPS timer
fps = FPS().start()
while True:
    frame = cap.read()
    # Resize each frame
    resized_image = cv2.resize(frame, (256, 256))
    resized_image = tf.image.convert_image_dtype((resized_image/255.0), dtype=tf.float32).numpy()
    mask = model.predict(np.expand_dims(resized_image[:,:,:3], axis=0))[0]
    # show the output frame
    cv2.imshow("Frame", mask)
    key = cv2.waitKey(1) & 0xFF
    # Press 'q' key to break the loop
    if key == ord("q"):
        break
    # update the FPS counter
    fps.update()
# stop the timer
fps.stop()
# Display FPS Information: Total Elapsed time and an approximate FPS over the entire video stream
print("[INFO] Elapsed Time: {:.2f}".format(fps.elapsed()))
print("[INFO] Approximate FPS: {:.2f}".format(fps.fps()))
# Destroy windows and cleanup
cv2.destroyAllWindows()
# Stop the video stream
cap.stop()
EDIT-1
After doing float16 quantization I loaded the model as tflite_model and then fed the input (image) into it. But the result is even slower!! Is this the right approach?
interpreter = tf.lite.Interpreter('tflite_model.tflite')
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
# ....................... process .............
while True:
    # ............. process ............
    interpreter.set_tensor(input_details[0]['index'], np.expand_dims(resized_image[:,:,:3], axis=0))
    interpreter.invoke()
    mask = interpreter.get_tensor(output_details[0]['index'])[0]
    # mask = model.predict(np.expand_dims(resized_image[:,:,:3], axis=0))[0]
    # ............ display part ........
You can make it faster in several different ways:
Model quantization:
TensorFlow Lite supports converting the weights to 16-bit floating point values. This is probably the simplest approach, since it only requires saving the model in tf.float16; alternatively, retraining with float16 or float8 may be faster.
https://www.tensorflow.org/lite/performance/post_training_float16_quant
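As a minimal sketch of the post-training float16 conversion described in that guide (assuming model is your already-trained Keras model; the output filename is only illustrative):

import tensorflow as tf

# Convert the trained Keras model to TFLite, storing weights as float16
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_model = converter.convert()

# Write the quantized model to disk
with open('tflite_model.tflite', 'wb') as f:
    f.write(tflite_model)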
Model distillation:
You can train a small model that is trained against the large model's outputs via the loss function, so it learns as much as possible from the large model.
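A rough sketch of one such distillation training step, assuming a hypothetical smaller student model, the large trained teacher model, a binary-crossentropy loss and a 0.5 weighting (all of these are illustrative; adapt them to your own segmentation loss):

import tensorflow as tf

# Hypothetical setup: `teacher` is the large trained model,
# `student` is a smaller model trained to imitate it.
bce = tf.keras.losses.BinaryCrossentropy()

def distillation_loss(true_mask, teacher_mask, student_mask, alpha=0.5):
    # Mix the supervised loss (against ground truth) with a term that pulls
    # the student towards the teacher's soft predictions.
    return alpha * bce(true_mask, student_mask) + (1.0 - alpha) * bce(teacher_mask, student_mask)

@tf.function
def train_step(images, true_masks, teacher, student, optimizer):
    teacher_masks = teacher(images, training=False)
    with tf.GradientTape() as tape:
        student_masks = student(images, training=True)
        loss_value = distillation_loss(true_masks, teacher_masks, student_masks)
    grads = tape.gradient(loss_value, student.trainable_variables)
    optimizer.apply_gradients(zip(grads, student.trainable_variables))
    return loss_value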
Model pruning:
You can compress your model with pruning, which will make it faster. You can also read the TensorFlow documentation on pruning.
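A minimal pruning sketch with the tensorflow_model_optimization package (a separate pip install); the sparsity schedule, end_step and train_ds are placeholders you would tune for your own data:

import tensorflow_model_optimization as tfmot

# Wrap the trained model so low-magnitude weights are gradually zeroed out
# while you fine-tune it for a few epochs.
pruning_schedule = tfmot.sparsity.keras.PolynomialDecay(
    initial_sparsity=0.0, final_sparsity=0.5, begin_step=0, end_step=1000)
pruned_model = tfmot.sparsity.keras.prune_low_magnitude(
    model, pruning_schedule=pruning_schedule)

pruned_model.compile(optimizer='adam', loss=loss, metrics=[dice_coefficient])
# train_ds is a placeholder for your training dataset
# pruned_model.fit(train_ds, epochs=2,
#                  callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])

# Remove the pruning wrappers before saving/converting the final model
final_model = tfmot.sparsity.keras.strip_pruning(pruned_model)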