Networks with multiple outputs: how is the loss computed?
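When training a network with more than one branch, and therefore more than one loss, the Keras documentation says that the global loss is a weighted sum of the partial losses, i.e. final_loss = l1*loss1 + l2*loss2
However, my model has two branches and is compiled with a categorical cross-entropy loss on each branch and the option loss_weights=[1., 1.]. I expected the global loss to be the average of the two losses (since the two partial losses are equally weighted), but that is not the case. I get a relatively high global loss, and I cannot work out how it is computed from the partial losses and their weights. Some training values are shown below. Could anyone explain how the global loss is computed with these parameters? And should the loss weights sum to no more than 1 (i.e. should I use loss_weights=[0.5, 0.5] instead)?
I have been stuck on this for a long time, so I would be very grateful for any help.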
Epoch 2/200
26/26 [==============================] - 39s 1s/step - loss: 9.2902 -
dense_1_loss: 0.0801 - dense_2_loss: 0.0717 -
Epoch 3/200
26/26 [==============================] - 39s 1s/step - loss: 8.2261 -
dense_1_loss: 0.0251 - dense_2_loss: 0.0199 -
Epoch 4/200
26/26 [==============================] - 39s 2s/step - loss: 7.3107 -
dense_1_loss: 0.0595 - dense_2_loss: 0.0048 -
Epoch 5/200
26/26 [==============================] - 39s 1s/step - loss: 6.4586 -
dense_1_loss: 0.0560 - dense_2_loss: 0.0025 -
Epoch 6/200
26/26 [==============================] - 39s 1s/step - loss: 5.9463 -
dense_1_loss: 0.1964 - dense_2_loss: 0.0653 -
Epoch 7/200
26/26 [==============================] - 39s 1s/step - loss: 5.3730 -
dense_1_loss: 0.1722 - dense_2_loss: 0.0447 -
Epoch 8/200
26/26 [==============================] - 39s 1s/step - loss: 4.8407 -
dense_1_loss: 0.1396 - dense_2_loss: 0.0169 -
Epoch 9/200
26/26 [==============================] - 39s 1s/step - loss: 4.4465 -
dense_1_loss: 0.1614 - dense_2_loss: 0.0124 -
Epoch 10/200
26/26 [==============================] - 39s 2s/step - loss: 3.9898 -
dense_1_loss: 0.0588 - dense_2_loss: 0.0119 -
Epoch 11/200
26/26 [==============================] - 39s 1s/step - loss: 3.6347 -
dense_1_loss: 0.0302 - dense_2_loss: 0.0085 -
Correct. The global loss is the weighted sum of the two partial losses:
Global loss=(loss1 * weight1 + loss2 * weight2)
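Written for the general case, the formula above looks like the following. The symbols N (number of outputs), w_i (the i-th entry of loss_weights) and L_i (the loss of output i) are notation introduced here for illustration. As the formula reads, the weights are applied as given and are not normalised, so they do not have to sum to 1: loss_weights=[1., 1.] yields the sum of the two losses, while loss_weights=[0.5, 0.5] yields their mean.

```latex
\[
  L_{\text{total}} = \sum_{i=1}^{N} w_i \, L_i
\]
\[
  w = (1, 1):\; L_{\text{total}} = L_1 + L_2
  \qquad
  w = (0.5, 0.5):\; L_{\text{total}} = \tfrac{1}{2}\,(L_1 + L_2)
\]
```

Multiplying every weight by the same constant scales this weighted term, and therefore its gradient, by that constant, so the choice between [1., 1.] and [0.5, 0.5] mainly changes the scale of the reported number rather than the balance between the two outputs.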
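I used a Keras functional model to demonstrate that the global loss is the weighted sum of the two partial losses. Please see the entire code here.
The model is compiled as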
model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
              loss=[keras.losses.BinaryCrossentropy(from_logits=True),
                    keras.losses.CategoricalCrossentropy(from_logits=True)],
              loss_weights=[1., 0.2])
The model is trained as
model.fit({'title': title_data, 'body': body_data, 'tags': tags_data},
          {'priority': priority_targets, 'department': dept_targets},
          epochs=2, batch_size=32)
Epoch 1/2
40/40 [==============================] - 2s 45ms/step - loss: 1.2723 - priority_loss: 0.7062 - department_loss: 2.8304
Epoch 2/2
40/40 [==============================] - 2s 46ms/step - loss: 1.2593 - priority_loss: 0.6995 - department_loss: 2.7993
See how the weights and the two losses combine to give the overall loss:
(loss1*weight1+loss2*weight2)
(0.7062*1.0+2.8304*0.2)#1.27228
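The same arithmetic can be checked for both epochs of this run and then applied to the numbers in the question. The sketch below uses only values copied from the logs above. The helper name weighted_total is made up for illustration and is not a Keras function.

```python
def weighted_total(losses, weights):
    """Weighted sum of per-output losses: sum(w_i * loss_i)."""
    return sum(w * l for w, l in zip(weights, losses))

# Values copied from the two epochs of the run above (loss_weights=[1., 0.2]).
answer_epochs = [
    {"total": 1.2723, "parts": [0.7062, 2.8304]},
    {"total": 1.2593, "parts": [0.6995, 2.7993]},
]
for epoch in answer_epochs:
    computed = weighted_total(epoch["parts"], [1.0, 0.2])
    print(f"reported={epoch['total']:.4f}  computed={computed:.4f}")

# Epoch 2 from the question (loss_weights=[1., 1.]).
question_total = 9.2902
question_sum = weighted_total([0.0801, 0.0717], [1.0, 1.0])
# Part of the reported loss that the two output losses do not account for.
residual = question_total - question_sum
print(f"weighted sum={question_sum:.4f}  unexplained remainder={residual:.4f}")
```

For the run above, the computed values agree with the reported totals to within the rounding of the log. For the question's log they do not: the two output losses add up to far less than the reported loss. One common cause, which the thread does not confirm for that model, is that Keras also includes regularization penalties (for example from a kernel_regularizer) and any terms registered with add_loss in the reported total. Inspecting model.losses would show whether such terms are present.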
Hope this helps.