TensorFlow model correctly predicting images, but not frames from real time video stream?

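Why does my TensorFlow model predict JPG and PNG images correctly but mispredict frames from a real-time video stream? Every frame from the live stream is misclassified as class 1.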

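What I tried: I saved a PNG image from the real-time video stream. When I save the PNG separately and test the model on it, the model classifies it correctly. When a similar image is a frame in the live video stream, it is misclassified. The PNG image and the live stream frame have visually identical content (background, lighting conditions, camera angle, etc.).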

My model structure:

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
rescaling_2 (Rescaling)      (None, 180, 180, 3)       0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 180, 180, 16)      448
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 90, 90, 16)        0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 90, 90, 32)        4640
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 45, 45, 32)        0
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 45, 45, 64)        18496
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 22, 22, 64)        0
_________________________________________________________________
flatten_1 (Flatten)          (None, 30976)             0
_________________________________________________________________
dense_2 (Dense)              (None, 128)               3965056
_________________________________________________________________
dense_3 (Dense)              (None, 3)                 387
=================================================================
Total params: 3,989,027
Trainable params: 3,989,027
Non-trainable params: 0
_________________________________________________________________
Found 1068 files belonging to 3 classes.

Real-time prediction code (updated with Keertika's help!):

import numpy as np
import tensorflow as tf
from tensorflow import keras

def testModel(imageName):
  # new_model is the trained model, loaded elsewhere in the script
  img_height = 180
  img_width = 180
  # Load the image from disk and resize it to the model's input size
  img = keras.preprocessing.image.load_img(
    imageName,
    target_size=(img_height, img_width),
    interpolation="bilinear",
    color_mode="rgb"
  )

  # Preprocessing differs here from the training pipeline.
  # Pixel values stay in [0, 255]; the model's Rescaling layer normalizes them.
  img_array = keras.preprocessing.image.img_to_array(img)
  img_array = tf.expand_dims(img_array, 0)  # Create a batch of one image

  predictions = new_model.predict(img_array)
  # Raw output of the last Dense layer; if that layer has no softmax activation,
  # these are logits and the "percent confidence" below is not a probability.
  score = predictions[0]
  classes = ['1', '2', '3']
  prediction = classes[np.argmax(score)]

  print(
      "This image {} most likely belongs to {} with a {:.2f} percent confidence."
      .format(imageName, prediction, 100 * np.max(score))
  )

  return prediction

Training code:

import tensorflow as tf

batch_size = 32
img_height = 180
img_width = 180

# image_dataset_from_directory returns a tf.data.Dataset that yields batches of images
# from the class subdirectories, together with integer labels (0, 1, 2 for 3 classes).
directory_test = "/content/test"

# Note: the return values of the next two calls are discarded, so the
# image_size=(256, 256) setting below has no effect on training.
tf.keras.utils.image_dataset_from_directory(
    directory_test, labels='inferred', label_mode='int',
    class_names=None, color_mode='rgb', batch_size=32, image_size=(256,
    256), shuffle=True, seed=None, validation_split=None, subset=None,
    interpolation='bilinear', follow_links=False,
    crop_to_aspect_ratio=False
)

tf.keras.utils.image_dataset_from_directory(directory_test, labels='inferred')

# The dataset actually used for training: 180x180 images, 80% of the files,
# resized with the default bilinear interpolation.
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
  directory_test,
  validation_split=0.2,
  subset="training",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)

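Does the reshaping in the real-time prediction code affect accuracy? I don't understand why frames are predicted incorrectly while individual JPG and PNG images are predicted correctly. Thanks for your help!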

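The real-time predictions are wrong because of the preprocessing. The preprocessing in your inference code should always match the preprocessing used during training. Use tf.keras.preprocessing.image.load_img in your real-time prediction code. It needs an image path to load the image, so you can save each frame under a name such as "sample.png" and pass that path to tf.keras.preprocessing.image.load_img. This should fix the problem. Also use the "bilinear" resize method, since that is what was used for the training data.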
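
If writing every frame to disk is too slow, roughly the same preprocessing can be done in memory. The sketch below is only an illustration under some assumptions that are not in the question: that the frames come from OpenCV (for example cv2.VideoCapture), which delivers uint8 arrays in BGR channel order, and that Pillow's bilinear resize is close enough to what load_img does (load_img relies on Pillow for loading and resizing). The helper name frame_to_batch is hypothetical. A synthetic array stands in for a captured frame so the snippet runs without a camera.

```python
import numpy as np
from PIL import Image

IMG_HEIGHT, IMG_WIDTH = 180, 180  # must match the image_size used for training


def frame_to_batch(frame_bgr):
    """Turn one BGR uint8 video frame into a (1, 180, 180, 3) float32 RGB batch."""
    # Assumption: the frame is in OpenCV's BGR order; reverse the last axis to get RGB.
    frame_rgb = np.ascontiguousarray(frame_bgr[:, :, ::-1])
    # Resize with Pillow's bilinear filter; note that PIL expects (width, height).
    resized = Image.fromarray(frame_rgb).resize((IMG_WIDTH, IMG_HEIGHT), Image.BILINEAR)
    # Keep float32 values in [0, 255]; the model's own Rescaling layer normalizes them.
    img_array = np.asarray(resized, dtype=np.float32)
    return np.expand_dims(img_array, axis=0)  # add the batch dimension


# Synthetic stand-in for a captured frame: pure blue, stored in BGR order.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
frame[:, :, 0] = 255
batch = frame_to_batch(frame)
print(batch.shape, batch.dtype)
```

With a real stream, the result would be passed straight to the model, e.g. new_model.predict(frame_to_batch(frame)). If the predictions still differ from those for the saved PNG, comparing the two arrays (shape, dtype, value range, channel order) is a quick way to find which preprocessing step is not the same.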