Training neural nets simultaneously in Keras and having them share losses jointly while training?
Suppose I want to train three models simultaneously (model 1, model 2, and model 3), and while training have model 2 and model 3 share their losses jointly with the main network (model 1), so that the main model can learn representations from the other two models in between layers.
total loss = (weight1) * loss_m1 + (weight2) * (loss_m1 - loss_m2) + (weight3) * (loss_m1 - loss_m3)
So far I have the following:
from tensorflow.keras.layers import Dense, Dropout, Input
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import Model

def threemodel(num_nodes, num_class, w1, w2, w3):
    # w1, w2, w3 are loss weights
    in1 = Input((6373,))
    enc1 = Dense(num_nodes)(in1)
    enc1 = Dropout(0.3)(enc1)
    enc1 = Dense(num_nodes, activation='relu')(enc1)
    enc1 = Dropout(0.3)(enc1)
    enc1 = Dense(num_nodes, activation='relu')(enc1)
    out1 = Dense(units=num_class, activation='softmax')(enc1)

    in2 = Input((512,))
    enc2 = Dense(num_nodes, activation='relu')(in2)
    enc2 = Dense(num_nodes, activation='relu')(enc2)
    out2 = Dense(units=num_class, activation='softmax')(enc2)

    in3 = Input((768,))
    enc3 = Dense(num_nodes, activation='relu')(in3)
    enc3 = Dense(num_nodes, activation='relu')(enc3)
    out3 = Dense(units=num_class, activation='softmax')(enc3)

    adam = Adam(lr=0.0001)
    model = Model(inputs=[in1, in2, in3], outputs=[out1, out2, out3])
    model.compile(loss='categorical_crossentropy',  # continue together
                  optimizer=adam,
                  metrics=['accuracy'])  # not sure what changes need to be made here

# I am confused about how to formulate the shared loss equation here to share the losses of out2 and out3 with out1.
After a bit of searching, it seems one can do the following:
loss_1 = tf.keras.losses.categorical_crossentropy(y_true_1, out1)
loss_2 = tf.keras.losses.categorical_crossentropy(y_true_2, out2)
loss_3 = tf.keras.losses.categorical_crossentropy(y_true_3, out3)
model.add_loss((w1)*loss_1 + (w2)*(loss_1 - loss_2) + (w3)*(loss_1 - loss_3))
Would this work? I feel like doing what I suggested above is not really doing what I want, which is to have the main model (mod1) learn representations in between layers from the other two models (mod2 and mod3).
Any suggestions?
Since you are not interested in using trainable weights (I label them coefficients to distinguish them from trainable weights), you can concatenate the outputs and pass them as a single output to a custom loss function. This means that those coefficients will be available when the training starts.
You should provide a custom loss function, as mentioned. A loss function is expected to take only 2 arguments, so you should use such a function, aka categorical_crossentropy, and it should also be familiar with the parameters you are interested in, such as coeffs and num_class. So I instantiate a wrapper function with the arguments I want and then pass the inner, actual loss function as the main loss function.
from tensorflow.keras.layers import Dense, Dropout, Input, Concatenate
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import Model
from tensorflow.python.keras import backend as K
def categorical_crossentropy_base(coeffs, num_class):
    def categorical_crossentropy(y_true, y_pred, from_logits=False, label_smoothing=0):
        """Computes the categorical crossentropy loss.
        Args:
            y_true: tensor of true targets.
            y_pred: tensor of predicted targets.
            from_logits: Whether `y_pred` is expected to be a logits tensor. By default,
                we assume that `y_pred` encodes a probability distribution.
            label_smoothing: Float in [0, 1]. If > `0` then smooth the labels.
        Returns:
            Categorical crossentropy loss value.
        https://github.com/tensorflow/tensorflow/blob/v1.15.0/tensorflow/python/keras/losses.py#L938-L966
        """
        y_pred1 = y_pred[:, :num_class]             # the 1st prediction
        y_pred2 = y_pred[:, num_class:2*num_class]  # the 2nd prediction
        y_pred3 = y_pred[:, 2*num_class:]           # the 3rd prediction

        # you should of course adapt the ground truth to contain all 3 ground truths
        y_true1 = y_true[:, :num_class]             # the 1st ground truth
        y_true2 = y_true[:, num_class:2*num_class]  # the 2nd ground truth
        y_true3 = y_true[:, 2*num_class:]           # the 3rd ground truth

        loss1 = K.categorical_crossentropy(y_true1, y_pred1, from_logits=from_logits)
        loss2 = K.categorical_crossentropy(y_true2, y_pred2, from_logits=from_logits)
        loss3 = K.categorical_crossentropy(y_true3, y_pred3, from_logits=from_logits)
        # combine the losses the way you like it; here, matching the formula in the question
        total_loss = coeffs[0]*loss1 + coeffs[1]*(loss1 - loss2) + coeffs[2]*(loss1 - loss3)
        return total_loss
    return categorical_crossentropy
num_nodes = 128  # example value; use whatever your model needs
num_class = 10   # example value; use whatever your model needs

in1 = Input((6373,))
enc1 = Dense(num_nodes)(in1)
enc1 = Dropout(0.3)(enc1)
enc1 = Dense(num_nodes, activation='relu')(enc1)
enc1 = Dropout(0.3)(enc1)
enc1 = Dense(num_nodes, activation='relu')(enc1)
out1 = Dense(units=num_class, activation='softmax')(enc1)

in2 = Input((512,))
enc2 = Dense(num_nodes, activation='relu')(in2)
enc2 = Dense(num_nodes, activation='relu')(enc2)
out2 = Dense(units=num_class, activation='softmax')(enc2)

in3 = Input((768,))
enc3 = Dense(num_nodes, activation='relu')(in3)
enc3 = Dense(num_nodes, activation='relu')(enc3)
out3 = Dense(units=num_class, activation='softmax')(enc3)

adam = Adam(lr=0.0001)
total_out = Concatenate(axis=1)([out1, out2, out3])
model = Model(inputs=[in1, in2, in3], outputs=[total_out])

coeffs = [1, 1, 1]
model.compile(loss=categorical_crossentropy_base(coeffs=coeffs, num_class=num_class),
              optimizer=adam, metrics=['accuracy'])
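To actually train this model, the ground truth has to be concatenated in the same order as the outputs. A minimal sketch of the data preparation and the fit call, assuming x1, x2, x3 and the one-hot label arrays y1, y2, y3 (all hypothetical names) already exist as numpy arrays:

import numpy as np

# hypothetical training data, shaped to match the three Input layers:
# x1: (n_samples, 6373), x2: (n_samples, 512), x3: (n_samples, 768)
# y1, y2, y3: one-hot labels, each of shape (n_samples, num_class)
y_total = np.concatenate([y1, y2, y3], axis=1)  # same order as the Concatenate layer

model.fit([x1, x2, x3], y_total, batch_size=32, epochs=10)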
I am not sure about the accuracy metric, though, but I think it will work without further changes. I am also using K.categorical_crossentropy here, but you are of course free to change it to another implementation as well.
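If the built-in 'accuracy' metric turns out not to be meaningful on the concatenated output (it would compare the full concatenated vectors rather than a single distribution), one option is a small custom metric that scores only the main head. This is a sketch of my own, not part of the original answer; main_head_accuracy is a hypothetical helper name:

def main_head_accuracy(num_class):
    # categorical accuracy computed on the first num_class columns only,
    # i.e. on out1, the main model's prediction
    def acc(y_true, y_pred):
        y_true1 = y_true[:, :num_class]
        y_pred1 = y_pred[:, :num_class]
        return K.cast(K.equal(K.argmax(y_true1, axis=-1),
                              K.argmax(y_pred1, axis=-1)), K.floatx())
    return acc

model.compile(loss=categorical_crossentropy_base(coeffs=coeffs, num_class=num_class),
              optimizer=adam, metrics=[main_head_accuracy(num_class)])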