TensorFlow model correctly predicts images, but not frames from a real-time video stream?
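Why does my TensorFlow model correctly predict JPG and PNG images but mispredict frames from a real-time video stream? Every frame from the live stream is incorrectly classified as class 1.
What I tried: I saved a PNG image from the live video stream. When I save the PNG image separately and test it, the model classifies it correctly. When a similar image is a frame in the live video stream, it is misclassified. The PNG image and the live video frame have visually identical content (background, lighting conditions, camera angle, etc.).
My model structure: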
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
rescaling_2 (Rescaling)      (None, 180, 180, 3)       0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 180, 180, 16)      448
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 90, 90, 16)        0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 90, 90, 32)        4640
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 45, 45, 32)        0
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 45, 45, 64)        18496
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 22, 22, 64)        0
_________________________________________________________________
flatten_1 (Flatten)          (None, 30976)             0
_________________________________________________________________
dense_2 (Dense)              (None, 128)               3965056
_________________________________________________________________
dense_3 (Dense)              (None, 3)                 387
=================================================================
Total params: 3,989,027
Trainable params: 3,989,027
Non-trainable params: 0
_________________________________________________________________
Found 1068 files belonging to 3 classes.
Real-time prediction code (updated with Keertika's help!):
import numpy as np
import tensorflow as tf
from tensorflow import keras

def testModel(imageName):
    # new_model is the trained model, loaded elsewhere in the script.
    img_height = 180
    img_width = 180
    # Load the image from disk as RGB and resize it with bilinear interpolation.
    img = keras.preprocessing.image.load_img(
        imageName,
        target_size=(img_height, img_width),
        interpolation="bilinear",
        color_mode="rgb",
    )
    # No manual scaling here: the model's own Rescaling layer handles it.
    img_array = keras.preprocessing.image.img_to_array(img)
    img_array = tf.expand_dims(img_array, 0)  # Create a batch of one image
    predictions = new_model.predict(img_array)
    score = predictions[0]
    classes = ['1', '2', '3']
    prediction = classes[np.argmax(score)]
    # The printed value is a true percentage only if the last layer outputs probabilities.
    print(
        "This image {} most likely belongs to {} with a {:.2f} percent confidence."
        .format(imageName, prediction, 100 * np.max(score))
    )
    return prediction
Training code:
import tensorflow as tf

# Same values as in the prediction code above.
batch_size = 32
img_height = 180
img_width = 180

# image_dataset_from_directory returns a tf.data.Dataset that yields batches of
# images from the class subdirectories, together with integer labels.
directory_test = "/content/test"

# The two calls below build datasets that are never assigned, so they have no
# effect on training (note the 256x256 image_size in the first one).
tf.keras.utils.image_dataset_from_directory(
    directory_test, labels='inferred', label_mode='int',
    class_names=None, color_mode='rgb', batch_size=32, image_size=(256, 256),
    shuffle=True, seed=None, validation_split=None, subset=None,
    interpolation='bilinear', follow_links=False,
    crop_to_aspect_ratio=False
)
tf.keras.utils.image_dataset_from_directory(directory_test, labels='inferred')

# This is the dataset actually used for training: 180x180 images, bilinear resize.
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    directory_test,
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)
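Does the resizing in the real-time prediction code affect accuracy? I don't understand why the frame predictions are wrong while predictions for individual JPG and PNG images are correct. Thanks for your help!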
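The real-time predictions are wrong because of the preprocessing. The preprocessing in the inference code should always be the same as the preprocessing used during training. Use tf.keras.preprocessing.image.load_img in your real-time prediction code. It needs an image path to load the image, so you can save each frame under the name "sample.png" and pass that path to tf.keras.preprocessing.image.load_img. That should solve the problem. Also use the "bilinear" resize method, because that is what was used for the training data.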
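If writing every frame to disk is too slow, the same preprocessing can be approximated in memory. The sketch below is an alternative to the save-and-reload approach, not a description of it. It assumes the frames come from OpenCV (which returns arrays in BGR channel order, while load_img returns RGB) and that the model keeps its own Rescaling layer, so no manual scaling is applied. The helper names bgr_to_rgb and frame_to_batch are made up for illustration, and tf.image.resize may give slightly different pixel values than load_img, which resizes through PIL.

```python
import numpy as np


def bgr_to_rgb(frame_bgr):
    """Reverse the channel axis of an H x W x 3 BGR frame to get RGB."""
    # ascontiguousarray makes a copy, so the result is not a reversed view.
    return np.ascontiguousarray(frame_bgr[..., ::-1])


def frame_to_batch(frame_bgr, img_height=180, img_width=180):
    """Turn one BGR video frame into a float batch of shape (1, H, W, 3)."""
    # Imported here so the channel helper above works without TensorFlow.
    import tensorflow as tf

    rgb = bgr_to_rgb(frame_bgr)
    # Bilinear resize to the training image size (assumed 180 x 180).
    resized = tf.image.resize(rgb, (img_height, img_width), method="bilinear")
    # Add the batch dimension expected by model.predict.
    return tf.expand_dims(resized, 0)
```

With a frame read from the stream, the call would then look like new_model.predict(frame_to_batch(frame)). If the predictions still differ from those on the saved PNG, comparing the two input arrays directly is a quick way to find the remaining preprocessing difference.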