How to implement a multi-task classification Keras model properly?

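I have a model that classifies samples from two related datasets. The first dataset has 4 classes and the second has 2 classes. This is the model I implemented: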

def build_model():

    branch_A_input = tf.keras.Input(shape=IMG_SHAPE)
    branch_A_rescale = tf.keras.layers.experimental.preprocessing.Rescaling(1./127.5, offset= -1)(branch_A_input)
    branch_A = tf.keras.layers.Dropout(0.3)(branch_A_rescale)
    branch_A = tf.keras.layers.Conv2D(filters = 128, kernel_size = 13, activation= 'swish', name = "base_conv_A")(branch_A)
    branch_A = tf.keras.layers.BatchNormalization(name = "base_batch_normalization_A")(branch_A)
    branch_A = tf.keras.Model(inputs=branch_A_input, outputs = branch_A)


    branch_B_input = tf.keras.Input(shape=IMG_SHAPE)
    branch_B_rescale = tf.keras.layers.experimental.preprocessing.Rescaling(1./127.5, offset= -1)(branch_B_input)
    branch_B = tf.keras.layers.Dropout(0.3)(branch_B_rescale)
    branch_B = tf.keras.layers.Conv2D(filters = 128, kernel_size = 13, activation= 'swish', name = "base_conv_B")(branch_B)
    branch_B = tf.keras.layers.BatchNormalization(name = "base_batch_normalization_B")(branch_B)
    branch_B = tf.keras.Model(inputs=branch_B_input, outputs = branch_B)

    merge = concatenate([branch_A.output, branch_B.output])

    output_A = tf.keras.layers.Dense(tsk1_CLASSES_NUM, activation='softmax', name='4cls')(merge)
    output_B= tf.keras.layers.Dense(1, name='2cls')(merge)

    model = tf.keras.Model(inputs = [branch_A.input, branch_B.input] , outputs = [output_A, output_B], name="multi_task_model")
    
    optimizer = tf.keras.optimizers.get('adam')
    optimizer.learning_rate = 0.001
    losses = {'4cls': tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              '2cls': tf.keras.losses.BinaryCrossentropy(from_logits=True)}
 
    mtrcs = {
    "4cls": 'accuracy',
    "2cls": 'accuracy'
    }
    model.compile(optimizer=optimizer,
                  loss= losses, 
                  metrics=mtrcs)
    
    return model

model = build_model()

When I try to train the model using

history = model.fit([tsk1_train_ds,tsk2_train_ds],
                    epochs=initial_epochs)

I get the error message

ValueError: Failed to find data adapter that can handle input: (<class 'list'> containing values of types {"<class 'tensorflow.python.data.ops.dataset_ops.PrefetchDataset'>"}), <class 'NoneType'>
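
A likely reading of this message is that `model.fit` accepts a single `tf.data.Dataset`, not a Python list of datasets. One possible workaround is to zip the two datasets into one whose elements have the form `((images_1, images_2), (labels_1, labels_2))`. The sketch below is only an illustration under assumptions: `make_dummy_ds` and `regroup` are hypothetical helpers, and the random tensors stand in for `tsk1_train_ds` and `tsk2_train_ds`, which are assumed to yield `(images, labels)` batches of the same batch size.

```python
import tensorflow as tf

# Hypothetical stand-in for tsk1_train_ds / tsk2_train_ds:
# a batched dataset that yields (images, labels) pairs.
def make_dummy_ds(num_classes, n=16, batch_size=8):
    images = tf.random.uniform((n, 166, 166, 3), maxval=255.0)
    labels = tf.random.uniform((n,), maxval=num_classes, dtype=tf.int64)
    return tf.data.Dataset.from_tensor_slices((images, labels)).batch(batch_size)

tsk1_ds = make_dummy_ds(num_classes=4)
tsk2_ds = make_dummy_ds(num_classes=2)

# Rearrange ((img_1, lbl_1), (img_2, lbl_2)) into
# ((img_1, img_2), (lbl_1, lbl_2)), i.e. (inputs, targets).
def regroup(batch_1, batch_2):
    (img_1, lbl_1), (img_2, lbl_2) = batch_1, batch_2
    return (img_1, img_2), (lbl_1, lbl_2)

combined_ds = tf.data.Dataset.zip((tsk1_ds, tsk2_ds)).map(regroup)

inputs, targets = next(iter(combined_ds))
print([tuple(t.shape) for t in inputs])   # shapes of the two image batches
print([tuple(t.shape) for t in targets])  # shapes of the two label batches
```

With a dataset of this form, the call would presumably become `model.fit(combined_ds, epochs=initial_epochs)`, with the targets matched to the outputs in order (`4cls`, then `2cls`). Note that `Dataset.zip` stops as soon as the shorter dataset is exhausted, so the two datasets should have comparable lengths.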

When I instead try

tsk1_image_batch, tsk1_label_batch = next(iter(tsk1_train_ds))
tsk2_image_batch, tsk2_label_batch = next(iter(tsk2_train_ds))

history = model.fit(x = [tsk1_image_batch, tsk2_image_batch],
                    y = [tsk1_label_batch, tsk2_label_batch],
                    epochs=initial_epochs)

I get the error message

TypeError: 'NoneType' object is not callable

The data in tsk1_image_batch and tsk2_image_batch look like this:

<tf.Tensor: shape=(8, 166, 166, 3), dtype=float32, numpy=
array([[[[ 38.996986  ,  38.996986  ,  38.996986  ],
         [ 73.22591   ,  73.22591   ,  73.22591   ],
         [ 85.44268   ,  85.44268   ,  85.44268   ],
         ...,
         [ 85.927734  ,  85.927734  ,  85.927734  ],
         [ 75.0845    ,  75.0845    ,  75.0845    ],
         [ 60.244205  ,  60.244205  ,  60.244205  ]],

        [[  9.421633  ,   9.421633  ,   9.421633  ],
         [ 53.4908    ,  53.4908    ,  53.4908    ],
         [ 64.668945  ,  64.668945  ,  64.668945  ],
         ...,
         [ 82.186516  ,  82.186516  ,  82.186516  ],
         [ 69.15674   ,  69.15674   ,  69.15674   ],
         [ 59.0754    ,  59.0754    ,  59.0754    ]],

The data in tsk1_label_batch look like this:

 <tf.Tensor: shape=(8,), dtype=int64, numpy=array([0, 2, 3, 2, 3, 3, 0, 0], dtype=int64)>

The data in tsk2_label_batch look like this:

<tf.Tensor: shape=(8,), dtype=int64, numpy=array([0, 1, 1, 1, 1, 0, 1, 0], dtype=int64)>

I am not sure what I am missing. Thanks for your help.

Edit

From this answer, it seems that I should not use from_logits=True when the output layer already applies softmax. So I updated the relevant code:

losses = {'4cls': tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
          '2cls': tf.keras.losses.BinaryCrossentropy(from_logits=True)}

Now I get the error

ValueError: `logits` and `labels` must have the same shape, received ((None, 154, 154, 1) vs (None,)).
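
The `(None, 154, 154, 1)` shape in this message can be reproduced by hand. The sketch below assumes `IMG_SHAPE = (166, 166, 3)` (taken from the batch shape shown earlier) and relies on the Keras `Conv2D` defaults of `padding='valid'` and a stride of 1; `conv_output_size` is a hypothetical helper written only for this illustration. Because a `Dense` layer acts on the last axis only, the spatial axes survive all the way to the output.

```python
def conv_output_size(input_size, kernel_size, stride=1):
    """Spatial output size of a convolution with 'valid' (no) padding."""
    return (input_size - kernel_size) // stride + 1

# Assumed input side length, taken from the (8, 166, 166, 3) batch.
side = conv_output_size(166, 13)

# Each branch ends with 128 filters; concatenate joins them on the last axis.
channels = 128 + 128
merged_shape = (None, side, side, channels)

# Dense(1) replaces only the last axis, so the 154 x 154 grid remains.
dense_1_shape = merged_shape[:-1] + (1,)

print(merged_shape)   # (None, 154, 154, 256)
print(dense_1_shape)  # (None, 154, 154, 1)
```

The labels, by contrast, have shape `(None,)`, which is why the loss reports a mismatch: the spatial axes need to be collapsed before the `Dense` heads.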

Following a comment by Marco Cerliani on this question, I should use a Flatten or GlobalPooling layer. I used a GlobalAveragePooling2D layer, so the code should look like this:

merge = concatenate([branch_A.output, branch_B.output])
gap = tf.keras.layers.GlobalAveragePooling2D()(merge)

output_A = tf.keras.layers.Dense(OCT_CLASSES_NUM, activation='softmax', name='4cls')(gap)
output_B= tf.keras.layers.Dense(1,activation='sigmoid', name='2cls')(gap)
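
One thing that may still need attention: the `2cls` head now applies a sigmoid, while the loss shown earlier for it uses `from_logits=True`, which would apply a sigmoid a second time. Using `BinaryCrossentropy(from_logits=False)` appears to be the matching setting for this head. The plain-Python sketch below only illustrates the arithmetic; the helper functions are hypothetical and are not how Keras computes the loss internally (Keras uses a numerically stable formulation).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce_from_probs(label, prob):
    """Binary cross-entropy when the model output is already a probability."""
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))

def bce_from_logits(label, logit):
    """Binary cross-entropy when the model output is a raw logit."""
    return bce_from_probs(label, sigmoid(logit))

logit, label = 2.0, 1
prob = sigmoid(logit)  # what a Dense(1, activation='sigmoid') head would emit

# Matching setup: probability output with a probability-based loss.
matched = bce_from_probs(label, prob)

# Mismatched setup: probability output fed to a logit-based loss,
# so the sigmoid is effectively applied twice.
mismatched = bce_from_logits(label, prob)

print(matched, mismatched)
```

In this example the mismatched loss is larger than the matched one for a confident correct prediction, which suggests the double sigmoid would distort training rather than raise an error.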