How to write a conditional regression loss function based on binary cross-entropy loss for a Keras model
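I am building a key-point face detection system. The goal is to feed a face image to the model, have the model detect anatomical landmarks (eyes, nose) in the image, and output the pixel coordinates of the visible landmarks. Each landmark has three targets: x, y, and visible. x and y are pixel coordinates, and visible indicates whether the landmark is in the image. The plan is to first compute a binary cross-entropy loss between the predicted and true visibility. The second loss is a regression loss between the predicted x, y coordinates and the targets (I am using MAPE). However, the regression loss should only be computed for visible landmarks. The loss would look like: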
# Pseudo-code
def loss(y_true, y_pred):
    # y_true[2] is the visibility flag of the landmark
    if y_true[2] == 1:
        # Probability that landmark is in image
        # Compute binary cross entropy loss
        # Compute MAPE regression loss
        Total_loss = Binary_loss + MAPE_loss
        return Total_loss
    else:
        Total_loss = Binary_loss
        return Total_loss
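Once the loss function is written, how would I implement it in code? I know how to create a model for each problem separately (one for the coordinates, one for visibility), but I'm not sure how to combine the two heads with a conditional loss function. How would I combine the layers (Conv, Flatten, Dense for each head) to get the desired output? Thanks!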
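Edit:
I can't upload the data, but here is an image of it. The first 9 columns are the landmark coordinates and visibility flags. The last column is the corresponding flattened image. These are the steps I perform when loading the data for training: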
### Read in data file
import numpy as np
import pandas as pd

file = "Directory/file.csv"
train_data = pd.read_csv(file)
### Convert each coordinate column to type float64
train_data['xreye'] = train_data['xreye'].astype(np.float64)
...
### Convert image column to string type
train_data['Image'] = train_data['Image'].astype(str)
# Image is the feature; the other values are labels to predict later.
# Image column values are strings with some missing values, so split each
# string by space and replace missing pixels with '0'.
imag = []
for i in range(len(train_data)):
    img = train_data['Image'][i].split(' ')
    img = ['0' if x == '' else x for x in img]
    imag.append(img)
# Convert to uint8 and reshape to (N, 256, 256, 1)
image_list = np.array(imag, dtype='uint8')
X_train = image_list.reshape(-1, 256, 256, 1)
#### Get pixel coordinates and visibility targets
training = train_data[['xreye','yreye','reyev','xleye','yleye','leyev','xtsept','ytsept','tseptv']]
y_train = []
for i in range(len(train_data)):
    y = training.iloc[i, :]
    y_train.append(y)
y_train = np.array(y_train, dtype='float')
Edit: model code, loss function, and fit method.
### Loss function
import numpy as np
import tensorflow as tf

visuals_mask = [False, False, True] * 3

def loss_func(y_true, y_pred):
    # Visibility flags (columns 2, 5 and 8)
    visuals_true = tf.boolean_mask(y_true, visuals_mask, axis=1)
    visuals_pred = tf.boolean_mask(y_pred, visuals_mask, axis=1)
    # Functional form: BinaryCrossentropy is a class and would have to be
    # instantiated before it can be called on (y_true, y_pred)
    visuals_loss = tf.keras.losses.binary_crossentropy(visuals_true, visuals_pred)
    visuals_loss = tf.reduce_mean(visuals_loss)
    # x/y coordinates (the remaining six columns)
    coords_true = tf.boolean_mask(y_true, ~np.array(visuals_mask), axis=1)
    coords_pred = tf.boolean_mask(y_pred, ~np.array(visuals_mask), axis=1)
    coords_loss = tf.keras.losses.mean_absolute_percentage_error(coords_true, coords_pred)
    coords_loss = tf.reduce_mean(coords_loss)
    return coords_loss + visuals_loss
#### Model code
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, BatchNormalization, MaxPool2D, Flatten, Dense
from tensorflow.keras.callbacks import ModelCheckpoint

model = Sequential()
model.add(Conv2D(32, (3,3), activation='relu', padding='same', use_bias=False, input_shape=(256,256,1)))
model.add(BatchNormalization())
model.add(MaxPool2D(pool_size=(2,2)))
model.add(Conv2D(64, (3,3), activation='relu', padding='same', use_bias=False))
model.add(BatchNormalization())
model.add(MaxPool2D(pool_size=(2,2)))
model.add(Conv2D(128, (3,3), activation='relu', padding='same', use_bias=False))
model.add(BatchNormalization())
model.add(MaxPool2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(16, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(4, activation='relu'))
# 9 outputs: (x, y, visible) for each of the three landmarks
model.add(Dense(9, activation='linear'))
model.summary()
model.compile(optimizer='adam', loss=loss_func)
### Model fit
checkpointer = ModelCheckpoint('C:/Users/Cloud/.spyder-py3/x_y_shift/weights/vis_coords_TEST.hdf5', monitor='val_loss', verbose=1, mode='min', save_best_only=True)
out = model.fit(X_train, y_train, epochs=5, batch_size=4, validation_split=0.1, verbose=1, callbacks=[checkpointer])
I can't be sure, since I don't have the data to reproduce the problem, but these are the steps I have in mind:
- Use boolean masking to pick indices 2, 5 and 8 out of the output:
    visuals_mask_ = [False, False, True] * 3
    # in the loss function
    visuals_true = tf.boolean_mask(y_true, visuals_mask_, axis=1)  # do the same with preds
- Compute the visibility loss:
    visuals_loss = binary_crossentropy(visuals_true, visuals_pred)  # use sparse if that's the case
- Get the coordinate outputs the same way as the visibility outputs, but with the inverse of visuals_mask_. I believe tf.boolean_mask(y_true, tf.math.logical_not(visuals_mask_), axis=1) should work.
- Compute MAPE for the rest (coords_true and coords_pred).
- Take the mean of each of the two losses with tf.reduce_mean.
- Sum the losses and return the result.
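To make the masking step concrete, here is a minimal sketch of how the mask splits one label row into visibility flags and coordinates. The sample values are made up for illustration; only the (x, y, visible) column layout comes from the question.

```python
import numpy as np
import tensorflow as tf

# One sample with three landmarks laid out as (x, y, visible) triples.
# The numbers are illustrative only.
y = tf.constant([[10.0, 20.0, 1.0, 30.0, 40.0, 0.0, 50.0, 60.0, 1.0]])

visuals_mask = np.array([False, False, True] * 3)

# Columns 2, 5 and 8: the visibility flags.
visuals = tf.boolean_mask(y, visuals_mask, axis=1)
# The remaining six columns: the x/y coordinates.
coords = tf.boolean_mask(y, ~visuals_mask, axis=1)

print(visuals.numpy())  # [[1. 0. 1.]]
print(coords.numpy())   # [[10. 20. 30. 40. 50. 60.]]
```

Using a NumPy array for the mask makes the `~` inversion work directly, which a plain Python list would not allow.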
I hope this provides some insight.
Edit:
I tried the following, and it seems to work:
import numpy as np
import tensorflow as tf
from tensorflow.keras.losses import binary_crossentropy, mean_absolute_percentage_error

# Random stand-ins for a batch of 32 label rows and predictions
y_true = tf.convert_to_tensor(np.random.rand(32, 9))
y_pred = tf.convert_to_tensor(np.random.rand(32, 9))
visuals_mask = [False, False, True] * 3

def loss_func(y_true, y_pred):
    # Visibility flags (columns 2, 5 and 8)
    visuals_true = tf.boolean_mask(y_true, visuals_mask, axis=1)
    visuals_pred = tf.boolean_mask(y_pred, visuals_mask, axis=1)
    visuals_loss = binary_crossentropy(visuals_true, visuals_pred)
    visuals_loss = tf.reduce_mean(visuals_loss)
    # x/y coordinates (the remaining six columns)
    coords_true = tf.boolean_mask(y_true, ~np.array(visuals_mask), axis=1)
    coords_pred = tf.boolean_mask(y_pred, ~np.array(visuals_mask), axis=1)
    coords_loss = mean_absolute_percentage_error(coords_true, coords_pred)
    coords_loss = tf.reduce_mean(coords_loss)
    return coords_loss + visuals_loss

loss_func(y_true, y_pred)
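What I am assuming here:
- Your output actually has length 9, i.e. shape (batch_size, 9).
- Because of eager execution, the custom loss computation in this demo may differ from the one in actual training.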
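Note that the demo above applies MAPE to every coordinate, whether or not the landmark is visible. If the regression term should count only visible landmarks, as the question describes, one possible sketch is to weight each coordinate's error by its true visibility flag. The name `conditional_loss`, the `1e-7` guard against division by zero, and the normalisation by the number of visible coordinates are assumptions made for illustration, and the sample tensors are made up. Wrapping the function in `tf.function` is one way to check that graph mode gives the same value as eager mode.

```python
import numpy as np
import tensorflow as tf

VISUALS_MASK = np.array([False, False, True] * 3)

def conditional_loss(y_true, y_pred):
    y_true = tf.cast(y_true, y_pred.dtype)
    # Visibility flags, shape (batch, 3). BCE expects probabilities in [0, 1].
    vis_true = tf.boolean_mask(y_true, VISUALS_MASK, axis=1)
    vis_pred = tf.boolean_mask(y_pred, VISUALS_MASK, axis=1)
    vis_loss = tf.reduce_mean(
        tf.keras.losses.binary_crossentropy(vis_true, vis_pred))

    # Coordinates, shape (batch, 6), ordered x1, y1, x2, y2, x3, y3.
    coords_true = tf.boolean_mask(y_true, ~VISUALS_MASK, axis=1)
    coords_pred = tf.boolean_mask(y_pred, ~VISUALS_MASK, axis=1)
    # Absolute percentage error per coordinate.
    ape = 100.0 * tf.abs(coords_true - coords_pred) / tf.maximum(
        tf.abs(coords_true), 1e-7)
    # Repeat each flag twice so it lines up with its (x, y) pair.
    weights = tf.repeat(vis_true, repeats=2, axis=1)
    # Average over visible coordinates only; invisible ones contribute 0.
    coords_loss = tf.reduce_sum(ape * weights) / tf.maximum(
        tf.reduce_sum(weights), 1.0)
    return vis_loss + coords_loss

# Illustrative sample: the second landmark is not visible, so its wildly
# wrong coordinate prediction (0, 0) does not affect the loss.
y_true = tf.constant([[100.0, 200.0, 1.0, 50.0, 50.0, 0.0, 10.0, 10.0, 1.0]])
y_pred = tf.constant([[100.0, 200.0, 0.9, 0.0, 0.0, 0.1, 10.0, 10.0, 0.9]])

eager_value = float(conditional_loss(y_true, y_pred))
graph_loss = tf.function(conditional_loss)
graph_value = float(graph_loss(y_true, y_pred))
print(eager_value, graph_value)
```

Multiplying by the flags instead of branching with `if` keeps the loss a single differentiable tensor expression, which is what Keras needs when it traces the function for training.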
Edit 2:
I tried it with this model, and it seems to work:
model = Sequential()
model.add(Conv2D(4, 10, data_format='channels_last', input_shape=(256, 256, 1)))
model.add(Flatten())
model.add(Dense(9, activation='sigmoid'))
model.compile('adam', loss=loss_func)
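On the question of combining two heads: a single sigmoid layer squashes the coordinates as well as the visibility flags. One possible alternative, sketched here with the Keras Functional API, is a shared trunk with a linear coordinate head and a sigmoid visibility head, interleaved back into the (x, y, visible) layout the loss expects. The layer sizes are arbitrary placeholders, not a recommended architecture.

```python
import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(256, 256, 1))

# Shared trunk (sizes are placeholders chosen to keep the sketch small).
x = layers.Conv2D(8, 3, strides=2, padding="same", activation="relu")(inputs)
x = layers.MaxPool2D(4)(x)
x = layers.Flatten()(x)
x = layers.Dense(32, activation="relu")(x)

# Head 1: unbounded pixel coordinates, two per landmark.
coords = layers.Dense(6, activation="linear")(x)
# Head 2: visibility probabilities, one per landmark.
visible = layers.Dense(3, activation="sigmoid")(x)

# Interleave to x1, y1, v1, x2, y2, v2, x3, y3, v3.
coords = layers.Reshape((3, 2))(coords)
visible = layers.Reshape((3, 1))(visible)
merged = layers.Concatenate(axis=-1)([coords, visible])  # (batch, 3, 3)
outputs = layers.Reshape((9,))(merged)

model = tf.keras.Model(inputs, outputs)
model.summary()
```

Because the output keeps the same nine-column layout, this model can be compiled with a loss such as `loss_func` in the same way as the Sequential model above.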