TensorFlow - Preprocessing images for model prediction
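I trained a model using the Functional API and two different pretrained models: EfficientNet B5 and MobileNet V2. After training, I run an application that loads the saved model and uses it to make predictions.
I have a question about the correct way to pass the image to model.predict().
Model: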
self.feature_extractor1 = EfficientNetB5(  # weights='imagenet',
    input_shape=self.input_shape,
    include_top=False)
self.feature_extractor2 = MobileNetV2(  # weights='imagenet',
    input_shape=self.input_shape,
    include_top=False)
# Freeze both pretrained feature extractors
for layer in self.feature_extractor1.layers:
    layer.trainable = False
for layer in self.feature_extractor2.layers:
    layer.trainable = False
input_ = Input(shape=self.input_shape)
processed_input1 = b5_preprocess_input(input_)
processed_input2 = mbv2_preprocess_input(input_)
x1 = self.feature_extractor1(processed_input1)
x1 = GlobalAveragePooling2D()(x1)
x1 = Dropout(0.2)(x1)
x1 = Flatten()(x1)
x2 = self.feature_extractor2(processed_input2)
x2 = GlobalAveragePooling2D()(x2)
x2 = Dropout(0.2)(x2)
x2 = Flatten()(x2)
x = Concatenate()([x1, x2])
x = Dense(512, activation='relu')(x) #,kernel_initializer=initializer,kernel_regularizer=regularizers.l2(0.001))
x = Dense(1024, activation='relu')(x)
output_shape = Dense(shape_categories, activation='softmax', name='shape')(x)
model = Model(inputs=input_,
              outputs=output_shape)
adam_kwargs = {'beta_1': 0.9, 'beta_2': 0.9, 'epsilon': 1e-7}
sgd_kwargs = {'decay': 1e-6, 'momentum': 0.9, 'nesterov': True}
optimizer = self.optimizers(kwargs=adam_kwargs)
model.compile(loss='categorical_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])
model.summary()
STEP_SIZE_TRAIN = self.phase_gen[0].n // self.phase_gen[0].batch_size
STEP_SIZE_VALID = self.phase_gen[1].n // self.phase_gen[1].batch_size
if self.phases == 3:
    STEP_SIZE_TEST = self.phase_gen[2].n // self.phase_gen[2].batch_size
# Keep only the checkpoint with the best validation accuracy
checkpoint = ModelCheckpoint(self.model_dir,
                             monitor='val_accuracy',
                             verbose=1,
                             save_best_only=True,
                             mode='max')
tensorboard = TensorBoard(log_dir=self.model_dir + '/logs',
                          histogram_freq=5,
                          embeddings_freq=5)
#[EarlyStopping(monitor='val_loss', patience=8)]
callbacks = [checkpoint, tensorboard]
hist = model.fit_generator(generator=self.phase_gen[0],
                           steps_per_epoch=STEP_SIZE_TRAIN,
                           validation_data=self.phase_gen[1],
                           validation_steps=STEP_SIZE_VALID,
                           epochs=self.epochs,
                           callbacks=callbacks
                           )
In another script, I have the prediction method:
import io

import numpy as np
from PIL import Image
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input as mbv2_preprocess_input
from tensorflow.keras.applications.efficientnet import preprocess_input as b5_preprocess_input

def preprocess_image(img):
    # Decode the uploaded bytes, resize to the model input size, add a batch axis
    img = Image.open(io.BytesIO(img))
    img = img.resize((224, 224), Image.LANCZOS)  # same filter as the older Image.ANTIALIAS alias
    img = image.img_to_array(img)
    img = np.expand_dims(img, axis=0)
    #return [b5_preprocess_input(img), mbv2_preprocess_input(img)]
    return [img, img]

modelSHP = get_modelSHP()

@app.route('/part_numbers', methods=['POST'])
def part_number():
    img = request.files.get('image').read()
    processed_image = preprocess_image(img)
    predict_shape = modelSHP.predict(processed_image)
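My first thought was that I needed to pass the input (image) preprocessed by the right function for each branch, in the same order I used during training. But when I did that, my prediction accuracy stayed around zero. Passing just the image, without any preprocessing, works better.
Is the way I am passing the image to model.predict (without preprocessing) correct? I am wondering whether, given the Functional API and the way I built the model, the preprocessing became a layer in each branch of the model.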
I copied your code and printed the model summary, shown below.
Model: "functional_5"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_23 (InputLayer)           [(None, 224, 224, 3) 0
__________________________________________________________________________________________________
tf.math.truediv_5 (TFOpLambda)  (None, 224, 224, 3)  0           input_23[0][0]
__________________________________________________________________________________________________
tf.math.subtract_5 (TFOpLambda) (None, 224, 224, 3)  0           tf.math.truediv_5[0][0]
__________________________________________________________________________________________________
efficientnetb5 (Functional)     (None, 7, 7, 2048)   28513527    input_23[0][0]
__________________________________________________________________________________________________
mobilenetv2_1.00_224 (Functiona (None, 7, 7, 1280)   2257984     tf.math.subtract_5[0][0]
__________________________________________________________________________________________________
global_average_pooling2d_8 (Glo (None, 2048)         0           efficientnetb5[0][0]
__________________________________________________________________________________________________
global_average_pooling2d_9 (Glo (None, 1280)         0           mobilenetv2_1.00_224[0][0]
__________________________________________________________________________________________________
dropout_8 (Dropout)             (None, 2048)         0           global_average_pooling2d_8[0][0]
__________________________________________________________________________________________________
dropout_9 (Dropout)             (None, 1280)         0           global_average_pooling2d_9[0][0]
__________________________________________________________________________________________________
flatten_8 (Flatten)             (None, 2048)         0           dropout_8[0][0]
__________________________________________________________________________________________________
flatten_9 (Flatten)             (None, 1280)         0           dropout_9[0][0]
__________________________________________________________________________________________________
concatenate_3 (Concatenate)     (None, 3328)         0           flatten_8[0][0]
                                                                 flatten_9[0][0]
__________________________________________________________________________________________________
dense_6 (Dense)                 (None, 512)          1704448     concatenate_3[0][0]
__________________________________________________________________________________________________
dense_7 (Dense)                 (None, 1024)         525312      dense_6[0][0]
__________________________________________________________________________________________________
shape (Dense)                   (None, 2)            2050        dense_7[0][0]
==================================================================================================
Total params: 33,003,321
Trainable params: 2,231,810
Non-trainable params: 30,771,511
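As a quick cross-check of the summary, the parameter counts can be reproduced by hand. The sketch below is plain Python arithmetic, not Keras code; `dense_params` is a hypothetical helper name, and the layer widths and frozen counts are copied from the summary above. It suggests that only the three Dense layers of the head are trainable, which is what freezing both feature extractors should produce.

```python
def dense_params(n_inputs, n_units):
    """Parameters of a fully connected layer: one weight per input/unit pair plus one bias per unit."""
    return n_inputs * n_units + n_units

# Width after concatenating the two pooled feature vectors (2048 + 1280 = 3328)
concat_width = 2048 + 1280

# dense_6, dense_7 and the 2-class softmax layer named "shape"
head = [
    dense_params(concat_width, 512),
    dense_params(512, 1024),
    dense_params(1024, 2),
]
trainable = sum(head)

# Frozen backbones, counts taken from the summary rows
non_trainable = 28513527 + 2257984

print(head)                       # [1704448, 525312, 2050]
print(trainable)                  # 2231810
print(non_trainable)              # 30771511
print(trainable + non_trainable)  # 33003321
```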
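As you suspected, the preprocessing becomes layers in the model. So for prediction you do not need to preprocess the input yourself, because it is already built into the model. For EfficientNet, the preprocessing function is just a pass-through: the Keras EfficientNet models expect input pixels in the 0 to 255 range and do their own rescaling internally. That is why the model summary shows the input (input_23) fed directly into efficientnetb5. For MobileNet, the preprocessing function scales pixels to between -1 and +1, using the equation pixel = pixel/127.5 - 1. So the layer tf.math.truediv_5 divides input_23 by 127.5, and the layer tf.math.subtract_5 then subtracts 1. If you also preprocess the image before calling predict, that scaling is applied twice, which would explain the poor accuracy you saw.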
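To make the effect of preprocessing twice concrete, here is a minimal NumPy sketch of the MobileNet scaling formula quoted above. `mobilenet_v2_scale` is a hypothetical stand-in written for illustration, not the Keras function itself, and the three sample pixel values are arbitrary.

```python
import numpy as np

def mobilenet_v2_scale(pixels):
    """Hypothetical stand-in for the formula pixel / 127.5 - 1."""
    return np.asarray(pixels, dtype=np.float32) / 127.5 - 1.0

# Darkest, middle and brightest raw pixel values
raw = np.array([0.0, 127.5, 255.0], dtype=np.float32)

# What the truediv and subtract layers inside the model compute on raw pixels
once = mobilenet_v2_scale(raw)

# What the MobileNet branch would see if the caller had already scaled the image
twice = mobilenet_v2_scale(once)

print(once)   # [-1.  0.  1.]
print(twice)  # approximately [-1.0078, -1.0, -0.9922]
```

After the second pass every pixel sits within about 0.008 of -1, so the MobileNet branch receives an almost uniform image. Under that reading, the prediction side should hand the model the raw 0 to 255 float array and let the in-model layers do the scaling.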