TensorFlow - Preprocessing image in model prediction

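I have trained a model using the Functional API and two different pretrained models: EfficientNet B5 and MobileNet V2. After training, I run an application that loads the saved model and uses it to make some predictions.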

I have a question about the correct way to pass the image as the argument to model.predict().

Model:

    self.feature_extractor1 = EfficientNetB5(#weights='imagenet',
                                  input_shape=self.input_shape,
                                  include_top=False)

    self.feature_extractor2 = MobileNetV2(#weights='imagenet',
                                  input_shape=self.input_shape,
                                  include_top=False)


    for layer in self.feature_extractor1.layers:
        layer.trainable = False    

    for layer in self.feature_extractor2.layers:
        layer.trainable = False        
    

    input_ = Input(shape=self.input_shape)
    processed_input1 = b5_preprocess_input(input_)

    processed_input2 = mbv2_preprocess_input(input_)

    x1 = self.feature_extractor1(processed_input1)
    x1 = GlobalAveragePooling2D()(x1)
    x1 = Dropout(0.2)(x1)
    x1 = Flatten()(x1)

    x2 = self.feature_extractor2(processed_input2)
    x2 = GlobalAveragePooling2D()(x2)
    x2 = Dropout(0.2)(x2)
    x2 = Flatten()(x2)

    x = Concatenate()([x1, x2])

    x = Dense(512, activation='relu')(x) #,kernel_initializer=initializer,kernel_regularizer=regularizers.l2(0.001)) 
    x = Dense(1024, activation='relu')(x)

    output_shape = Dense(shape_categories, activation='softmax', name='shape')(x)

    model = Model(inputs=input_,
                  outputs=output_shape)
                  
    adam_kwargs = {'beta_1': 0.9, 'beta_2': 0.9, 'epsilon': 1e-7}
    sgd_kwargs = {'decay': 1e-6, 'momentum': 0.9, 'nesterov': True}
    optimizer = self.optimizers(kwargs=adam_kwargs)
    
    model.compile(loss='categorical_crossentropy',
                  optimizer=optimizer,
                  metrics=['accuracy'])

    model.summary()

    STEP_SIZE_TRAIN = self.phase_gen[0].n// self.phase_gen[0].batch_size
    STEP_SIZE_VALID = self.phase_gen[1].n// self.phase_gen[1].batch_size
    if self.phases == 3:
        STEP_SIZE_TEST = self.phase_gen[2].n// self.phase_gen[2].batch_size

    checkpoint = ModelCheckpoint(self.model_dir,
                                monitor='val_accuracy',
                                verbose=1,
                                save_best_only=True,
                                mode='max')
    tensorboard = TensorBoard(log_dir=self.model_dir + '/logs',
                            histogram_freq=5,
                            embeddings_freq=5)
                            #[EarlyStopping(monitor='val_loss', patience=8)]
    callbacks = [checkpoint, tensorboard]

    
    hist = model.fit_generator(generator=self.phase_gen[0],
                               steps_per_epoch=STEP_SIZE_TRAIN,
                               validation_data=self.phase_gen[1],
                               validation_steps=STEP_SIZE_VALID,
                               epochs=self.epochs,
                               callbacks=callbacks
                               )

In another script, I have the prediction method:

    from tensorflow.keras.applications.mobilenet_v2 import preprocess_input as mbv2_preprocess_input
    from tensorflow.keras.applications.efficientnet import preprocess_input as b5_preprocess_input

    def preprocess_image(img):
        img = Image.open(io.BytesIO(img))
        img = img.resize((224, 224), Image.ANTIALIAS)
        img = image.img_to_array(img)
        img = np.expand_dims(img, axis=0)
        #return [b5_preprocess_input(img),  mbv2_preprocess_input(img)]
        return [img, img]

    modelSHP = get_modelSHP()

    @app.route('/part_numbers', methods=['POST'])
    def part_number():
        img = request.files.get('image').read()
        processed_image = preprocess_image(img)
        predict_shape = modelSHP.predict(processed_image)

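My first thought was that I needed to pass the input (the image) preprocessed with the correct functions, in the same order I used them during training. But when I did that, my prediction accuracy stayed around zero. Passing just the image, without any preprocessing, works better.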

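Is the way I am passing the image to model.predict() (without preprocessing) correct? I wonder whether, with the Functional API and the way I built the model, the preprocessing became a layer in each branch of the model.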

I copied your code and printed the model summary, shown below:

    Model: "functional_5"
    __________________________________________________________________________________________________
    Layer (type)                    Output Shape         Param #     Connected to
    ==================================================================================================
    input_23 (InputLayer)           [(None, 224, 224, 3) 0
    __________________________________________________________________________________________________
    tf.math.truediv_5 (TFOpLambda)  (None, 224, 224, 3)  0           input_23[0][0]
    __________________________________________________________________________________________________
    tf.math.subtract_5 (TFOpLambda) (None, 224, 224, 3)  0           tf.math.truediv_5[0][0]
    __________________________________________________________________________________________________
    efficientnetb5 (Functional)     (None, 7, 7, 2048)   28513527    input_23[0][0]
    __________________________________________________________________________________________________
    mobilenetv2_1.00_224 (Functiona (None, 7, 7, 1280)   2257984     tf.math.subtract_5[0][0]
    __________________________________________________________________________________________________
    global_average_pooling2d_8 (Glo (None, 2048)         0           efficientnetb5[0][0]
    __________________________________________________________________________________________________
    global_average_pooling2d_9 (Glo (None, 1280)         0           mobilenetv2_1.00_224[0][0]
    __________________________________________________________________________________________________
    dropout_8 (Dropout)             (None, 2048)         0           global_average_pooling2d_8[0][0]
    __________________________________________________________________________________________________
    dropout_9 (Dropout)             (None, 1280)         0           global_average_pooling2d_9[0][0]
    __________________________________________________________________________________________________
    flatten_8 (Flatten)             (None, 2048)         0           dropout_8[0][0]
    __________________________________________________________________________________________________
    flatten_9 (Flatten)             (None, 1280)         0           dropout_9[0][0]
    __________________________________________________________________________________________________
    concatenate_3 (Concatenate)     (None, 3328)         0           flatten_8[0][0]
                                                                     flatten_9[0][0]
    __________________________________________________________________________________________________
    dense_6 (Dense)                 (None, 512)          1704448     concatenate_3[0][0]
    __________________________________________________________________________________________________
    dense_7 (Dense)                 (None, 1024)         525312      dense_6[0][0]
    __________________________________________________________________________________________________
    shape (Dense)                   (None, 2)            2050        dense_7[0][0]
    ==================================================================================================
    Total params: 33,003,321
    Trainable params: 2,231,810
    Non-trainable params: 30,771,511

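As you assumed, the preprocessing becomes layers in the model. So for prediction you do not have to preprocess the input yourself, because that step is already built into the model. For EfficientNet, the preprocessing function is just a pass-through, because EfficientNet expects input pixels in the range 0 to 255. That is why the model summary shows the input (input_23) being fed directly to efficientnetb5. For MobileNet, the preprocessing function scales the pixels to between -1 and +1, using the equation pixel = pixel / 127.5 - 1. So the layer tf.math.truediv_5 divides input_23 by 127.5, and then the layer tf.math.subtract_5 subtracts 1.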
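The scaling described above can be checked with a few lines of arithmetic. The sketch below is plain Python that mirrors the formula from the answer; `mobilenet_v2_scale` is a hypothetical helper written for illustration, not the Keras `preprocess_input` function itself. It also shows what would happen if an image that was already scaled outside the model were scaled again by the layers inside it, which may be one reason the external preprocessing hurt the predictions.

```python
def mobilenet_v2_scale(pixel):
    """Map a pixel value from [0, 255] to [-1, 1] using pixel / 127.5 - 1."""
    return pixel / 127.5 - 1.0


raw_pixels = [0.0, 127.5, 255.0]

# One pass, as performed by the truediv and subtract layers inside the model.
scaled_once = [mobilenet_v2_scale(p) for p in raw_pixels]

# Two passes: what the MobileNet branch would receive if the image had
# already been scaled before being handed to model.predict().
scaled_twice = [mobilenet_v2_scale(p) for p in scaled_once]

print(scaled_once)   # [-1.0, 0.0, 1.0]
print(scaled_twice)  # every value lies within about 0.008 of -1
```

With double scaling, the whole image collapses into a nearly constant input, so the MobileNet branch sees almost no contrast. Passing raw pixel values in the 0 to 255 range to model.predict() avoids this.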