Custom Loss Function in Keras with Sample Weights

I am new to TensorFlow and Keras, and I would like to use sample weights in a custom loss function.

If I understand correctly, this post () suggests providing the sample weights as an additional input to the network, as does this one:

I was wondering whether I am missing something (I would also rather not define the weights as global variables). I am also a bit surprised that there is no way to use them directly, since the Loss class's __call__ method does accept sample_weight as an argument, but, if I understand correctly, a loss function must take only the two arguments y_true and y_pred.
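
For illustration, here is a minimal sketch of that asymmetry (using the built-in MeanSquaredError purely as a stand-in): a Loss instance accepts sample_weight in its __call__, while a plain loss function only ever receives y_true and y_pred:

import tensorflow as tf

y_true = tf.constant([[1.0], [0.0], [1.0]])
y_pred = tf.constant([[0.9], [0.2], [0.4]])

# A Loss *instance* accepts sample_weight directly in __call__...
mse = tf.keras.losses.MeanSquaredError()
print(mse(y_true, y_pred, sample_weight=tf.constant([1.0, 0.0, 1.0])))

# ...whereas a custom loss *function* only receives (y_true, y_pred)
def my_mse(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)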

From the documentation (https://keras.io/api/losses/#creating-custom-losses), however:

Creating custom losses: Any callable with the signature loss_fn(y_true, y_pred) that returns an array of losses (one per sample in the input batch) can be passed to compile() as a loss. Note that sample weighting is automatically supported for any such loss.

it sounds like it should be possible to use sample weighting simply via model.fit(..., sample_weight=sample_weight).
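
A minimal sketch of what the documentation seems to promise (the model and data here are toy placeholders): the callable returns one loss per sample, and fit applies the weights automatically:

import numpy as np
import tensorflow as tf

def per_sample_mse(y_true, y_pred):
    # one loss value per sample in the batch: shape (batch_size,)
    return tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)

toy_model = tf.keras.Sequential([tf.keras.Input((4,)),
                                 tf.keras.layers.Dense(1)])
toy_model.compile(optimizer='adam', loss=per_sample_mse)

x = np.random.normal(size=(32, 4))
y = np.random.normal(size=(32, 1))
w = np.random.uniform(size=(32,))  # one weight per sample
toy_model.fit(x, y, sample_weight=w, epochs=1, verbose=0)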

In this post ( ) there is a lengthy discussion about the size of the output of the loss function.

And, lastly, it is also mentioned that a custom loss function should return an array of losses (one per individual sample); their reduction is then handled by the framework.

It seems to me, then, that if custom_loss(y_true, y_pred) returns a tensor of size (batch_size,), one should be able to use sample_weight in the fit method. What am I missing?
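
In other words, the expectation is that the framework reduces the (batch_size,) output roughly as follows (a sketch of the default SUM_OVER_BATCH_SIZE reduction, not the actual Keras internals):

import tensorflow as tf

per_sample_loss = tf.constant([0.5, 1.0, 2.0])  # custom_loss(y_true, y_pred), shape (batch_size,)
sample_weight = tf.constant([1.0, 0.0, 1.0])    # from fit(..., sample_weight=...)

# weighted per-sample losses, averaged over the batch size
batch_loss = (tf.reduce_sum(per_sample_loss * sample_weight)
              / tf.cast(tf.size(per_sample_loss), tf.float32))
print(batch_loss)  # 2.5 / 3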

Thanks a lot for your help!

Code snippet:

import numpy as np
import tensorflow as tf
from tensorflow.keras.losses import Loss
import tensorflow_probability as tfp
tfd = tfp.distributions

NUM_PARAMS_MG = 3  # parameter groups per component: (alpha, mu, sigma), as in compile() below


class NegLogLikMixedGaussian(Loss):
    """
    Negative Log-Likelihood of Mixed Gaussian with:
        num_components: number of components
        mu: means of the Gaussian components
        sg: standard deviations of the Gaussian components
    """

    def __init__(self, num_params=NUM_PARAMS_MG,
                 num_components=2, name='neg_log_lik_mixed_gaussian'):
        super(NegLogLikMixedGaussian, self).__init__(name=name)
        self.num_params = num_params
        self.num_components = num_components

    def call(self, y_true, p_predict):
        """
        Rem: for MDN the output of the networks are _parameters_ of the
        predicted distribution, _not_ point-estimates

        Parameters
        ----------
        y_true: (batch_size, 1)
            Observed value of the random variable
        p_predict: (batch_size, num_components)
            Output parameters of the network given some input

        Returns
        -------
        Negative log likelihood of the batch (batch_size, 1)

        """
        alpha, mu, sg = tf.split(p_predict,
                                 num_or_size_splits=self.num_params, axis=1)
        gm = tfd.MixtureSameFamily(
            mixture_distribution=tfd.Categorical(probs=alpha),
            components_distribution=tfd.Normal(loc=mu, scale=sg))
        log_likelihood = tf.transpose(gm.log_prob(tf.transpose(y_true)))
        return -tf.reduce_mean(log_likelihood, axis=-1)

My hope was then to be able to use:

from tensorflow.keras.optimizers import Adam

model.compile(optimizer=Adam(learning_rate=0.005),
              loss=NegLogLikMixedGaussian(num_components=2, num_params=3))

And:


# For testing purposes: uniform weights, which should give the same
# results as the un-weighted fit
sample_weight = np.ones(len(y_train)) / len(y_train)

# Some non-trivial weights
sample_weight = np.zeros(len(y_train))
sample_weight[:5] = 1
# This gives me the same results as above

model.fit(x_train, y_train, sample_weight=sample_weight,
          batch_size=128, epochs=10)

Your code is correct, apart from a few details, if I understand what you are trying to do. The sample weights should have dimension (number of samples,), while the loss should have dimension (batch_size,). Sample weights can be passed to the fit method, and this seems to work. In your custom loss class, both num_components and num_params are initialized, but only one of the two parameters is used in the call method. I am not sure I understand the dimensions of your tensors (alpha, mu, sg): do they have dimension (batch_size, 3, num_components), and are they predicted by your model? Based on my understanding of your question, below is code adapted from yours.

import tensorflow as tf
import numpy as np
from tensorflow.keras.losses import Loss
from tensorflow.keras import Model, Input
from tensorflow.keras.layers import Dense, Concatenate

import tensorflow_probability as tfp
tfd = tfp.distributions

# parameters
num_components = 2
num_samples = 1001
num_features = 10

# synthetic data
x_train = np.random.normal(size=(num_samples, num_features))
y_train = np.random.normal(size=(num_samples, 1, num_components))

print(x_train.shape)
print(y_train.shape)

class NegLogLikMixedGaussian(Loss):
    """
    Negative Log-Likelihood of Mixed Gaussian with:
        num_components: number of components
        mu: means of the Gaussian components
        sg: standard deviations of the Gaussian components
    """

    def __init__(self, num_components=2, name='neg_log_lik_mixed_gaussian'):
        super(NegLogLikMixedGaussian, self).__init__(name=name)
        self.num_components = num_components

    def call(self, y_true, p_predict):
        """
        Rem: for MDN the output of the networks are _parameters_ of the
        predicted distribution, _not_ point-estimates

        Parameters
        ----------
        y_true: (batch_size, 1, num_components)
            Observed value of the random variable
        p_predict: (batch_size, 3, num_components)
            Output parameters of the network given some input

        Returns
        -------
        Negative log likelihood of the batch (batch_size, 1)

        """
        alpha, mu, sg = tf.split(p_predict, num_or_size_splits=3, axis=1)
        gm = tfd.MixtureSameFamily(
            mixture_distribution=tfd.Categorical(probs=alpha),
            components_distribution=tfd.Normal(loc=mu, scale=sg))
        log_likelihood = gm.log_prob(y_true)
        return -tf.reduce_mean(log_likelihood, axis=[1, 2])

# the model (simply predicting (alpha, mu, sigma))
inputs = Input((num_features,))

# mixture weights: positive (relu + epsilon), then normalized over components
alpha = tf.expand_dims(Dense(num_components, activation='relu')(inputs), axis=1) + 0.0001
alpha = alpha / tf.reduce_sum(alpha, axis=2, keepdims=True)

# component means: unconstrained
mu = tf.expand_dims(Dense(num_components)(inputs), axis=1)

# component standard deviations: sg > 0 (relu + epsilon)
sg = tf.expand_dims(Dense(num_components, activation='relu')(inputs), axis=1) + 0.0001

# stack into shape (batch_size, 3, num_components)
outputs = Concatenate(axis=1)([alpha, mu, sg])

model = Model(inputs=inputs, outputs=outputs, name='gmm_params')
model.compile(optimizer='adam',
              loss=NegLogLikMixedGaussian(num_components=num_components),
              run_eagerly=False)

# weight only the first 500 samples; the rest are ignored
sample_weight = np.ones((num_samples,))
sample_weight[500:] = 0.

model.fit(x_train, y_train, batch_size=16, epochs=20, sample_weight=sample_weight)
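
As a quick sanity check (under the shape assumptions above), the call method can be invoked directly on a batch to confirm that the loss really is one value per sample, which is what makes sample_weight applicable:

# direct call bypasses the Loss reduction, exposing the per-sample losses
loss_fn = NegLogLikMixedGaussian(num_components=num_components)
y_batch = y_train[:16].astype(np.float32)
p_batch = model(x_train[:16].astype(np.float32))
print(loss_fn.call(y_batch, p_batch).shape)  # expected: (16,)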