Custom Loss Function in Keras with Sample Weights
I am new to Tensorflow and Keras, and I would like to use sample weights in a custom loss function.
If I understand correctly, this post suggests passing the weights as an input to the network, and so does this one.
I am wondering whether I am missing something (I would also rather not define the weights as global variables). I am also somewhat surprised that there is no direct way to use them, since the Loss class's __call__ method accepts sample_weight as an argument, yet, if I understand correctly, the loss function itself may take only the arguments y_true and y_pred.
From the documentation (https://keras.io/api/losses/#creating-custom-losses), however:
Creating custom losses
Any callable with the signature loss_fn(y_true, y_pred) that returns an array of losses (one per sample in the input batch) can be passed to compile() as a loss. Note that sample weighting is automatically supported for any such loss.
It sounds like it should be possible to use sample weighting via model.fit(..., sample_weight=sample_weight).
In this post there is a lengthy discussion about the size of the loss function's output.
Lastly, it is also mentioned that when a custom loss function is created, an array of losses (individual sample losses) should be returned; their reduction is handled by the framework.
It therefore seems to me that if custom_loss(y_true, y_pred) returns a tensor of shape (batch_size,), one should be able to use sample_weight in the fit method. What am I missing?
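For reference, here is a minimal sketch (not from the original post; the loss, model, and data are invented for illustration) of what the documentation describes: a callable that returns one loss value per sample, used together with sample_weight in fit:

import numpy as np
import tensorflow as tf

def per_sample_mse(y_true, y_pred):
    # Return one loss value per sample, shape (batch_size,);
    # Keras applies sample_weight and the reduction on top of this.
    return tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)

model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer='adam', loss=per_sample_mse)

x = np.random.normal(size=(32, 4))
y = np.random.normal(size=(32, 1))
w = np.random.uniform(size=(32,))  # one weight per sample
model.fit(x, y, sample_weight=w, epochs=1, verbose=0)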
Thank you very much for your help!
Code snippet:
import tensorflow as tf
import tensorflow_probability as tfp
from tensorflow.keras.losses import Loss

tfd = tfp.distributions

class NegLogLikMixedGaussian(Loss):
    """
    Negative log-likelihood of a Gaussian mixture with:
    num_components: number of components
    mu: means of the Gaussian components
    sg: standard deviations of the Gaussian components
    """
    def __init__(self, num_params=NUM_PARAMS_MG,  # NUM_PARAMS_MG defined elsewhere
                 num_components=2, name='neg_log_lik_mixed_gaussian'):
        super(NegLogLikMixedGaussian, self).__init__(name=name)
        self.num_params = num_params
        self.num_components = num_components

    def call(self, y_true, p_predict):
        """
        Rem: for an MDN the outputs of the network are _parameters_ of the
        predicted distribution, _not_ point estimates.

        Parameters
        ----------
        y_true: (batch_size, 1)
            Observed value of the random variable
        p_predict: (batch_size, num_components)
            Output parameters of the network given some input

        Returns
        -------
        Negative log-likelihood of the batch, shape (batch_size,)
        """
        alpha, mu, sg = tf.split(p_predict,
                                 num_or_size_splits=self.num_params, axis=1)
        gm = tfd.MixtureSameFamily(
            mixture_distribution=tfd.Categorical(probs=alpha),
            components_distribution=tfd.Normal(loc=mu, scale=sg))
        log_likelihood = tf.transpose(gm.log_prob(tf.transpose(y_true)))
        return -tf.reduce_mean(log_likelihood, axis=-1)
My hope was then to be able to use:
model.compile(optimizer=Adam(learning_rate=0.005),
              loss=NegLogLikMixedGaussian(
                  num_components=2, num_params=3))
and:
# For testing purposes; this should give the same results as unweighted
sample_weight = np.ones(len(y_train)) / len(dh.y_train_scaled)
# Some non-trivial weights
sample_weights = np.zeros(len(y_train))
sample_weights[:5] = 1
# This will give me the same results as above
model.fit(x_train, y_train, sample_weight=sample_weight,
          batch_size=128, epochs=10)
Your code is correct, apart from a few details, if I understand what you are trying to do.
The sample weights should have dimension (number of samples,), while the loss should have dimension (batch_size,).
The sample weights can be passed to the fit method, and this appears to work.
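As a quick illustration of this mechanism with a built-in loss (a minimal check, not part of the original answer): Keras multiplies the per-sample losses by the weights before reducing them.

import tensorflow as tf

mse = tf.keras.losses.MeanSquaredError()
y_true = tf.constant([[0.], [0.], [0.]])
y_pred = tf.constant([[1.], [2.], [3.]])
w = tf.constant([1., 0., 0.])
# Per-sample losses are 1, 4 and 9; the weights zero out the last two and
# the default reduction averages over the batch: (1 + 0 + 0) / 3 = 0.333.
print(mse(y_true, y_pred, sample_weight=w).numpy())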
In your custom loss class, num_components and num_params are both initialized, but only one of the two parameters is used in the call method.
I am not sure I understand the dimensions of your tensors (alpha, mu, sg): do they have dimension (batch_size, 3, num_components) and are they predicted by the model?
Based on my understanding of your problem, here is code adapted from yours.
import tensorflow as tf
import numpy as np
from tensorflow.keras.losses import Loss
from tensorflow.keras import Model, Input
from tensorflow.keras.layers import Dense, Concatenate
import tensorflow_probability as tfp

tfd = tfp.distributions

# parameters
num_components = 2
num_samples = 1001
num_features = 10

# synthetic data
x_train = np.random.normal(size=(num_samples, num_features))
y_train = np.random.normal(size=(num_samples, 1, num_components))
print(x_train.shape)
print(y_train.shape)

class NegLogLikMixedGaussian(Loss):
    """
    Negative log-likelihood of a Gaussian mixture with:
    num_components: number of components
    mu: means of the Gaussian components
    sg: standard deviations of the Gaussian components
    """
    def __init__(self, num_components=2, name='neg_log_lik_mixed_gaussian'):
        super(NegLogLikMixedGaussian, self).__init__(name=name)
        self.num_components = num_components

    def call(self, y_true, p_predict):
        """
        Rem: for an MDN the outputs of the network are _parameters_ of the
        predicted distribution, _not_ point estimates.

        Parameters
        ----------
        y_true: (batch_size, 1, num_components)
            Observed value of the random variable
        p_predict: (batch_size, 3, num_components)
            Output parameters of the network given some input

        Returns
        -------
        Negative log-likelihood of the batch, shape (batch_size,)
        """
        alpha, mu, sg = tf.split(p_predict, num_or_size_splits=3, axis=1)
        gm = tfd.MixtureSameFamily(
            mixture_distribution=tfd.Categorical(probs=alpha),
            components_distribution=tfd.Normal(loc=mu, scale=sg))
        log_likelihood = gm.log_prob(y_true)
        # reduce to one loss value per sample so that Keras can
        # apply sample_weight before the final reduction
        return -tf.reduce_mean(log_likelihood, axis=[1, 2])

# the model (simply predicting (alpha, mu, sigma))
input = Input((num_features,))
alpha = tf.expand_dims(Dense(num_components, 'relu')(input), axis=1) + 0.0001
# normalization so the mixture weights sum to 1
alpha = alpha / tf.reduce_sum(alpha, axis=2, keepdims=True)
mu = tf.expand_dims(Dense(num_components)(input), axis=1)
# sg > 0
sg = tf.expand_dims(Dense(num_components, 'relu')(input), axis=1) + 0.0001
outputs = Concatenate(axis=1)([alpha, mu, sg])
model = Model(inputs=input, outputs=outputs, name='gmm_params')
model.compile(optimizer='adam',
              loss=NegLogLikMixedGaussian(num_components=num_components),
              run_eagerly=False)

sample_weight = np.ones((num_samples,))
sample_weight[500:] = 0.
model.fit(x_train, y_train, batch_size=16, epochs=20,
          sample_weight=sample_weight)
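To convince yourself that the weighting mechanism is engaged, a quick check (hypothetical, not in the original answer) is to call the loss directly on a model output and verify that it returns one value per sample, which is exactly what lets Keras apply sample_weight:

loss_fn = NegLogLikMixedGaussian(num_components=num_components)
p = model(x_train[:16].astype(np.float32))
per_sample = loss_fn.call(tf.constant(y_train[:16], dtype=tf.float32), p)
print(per_sample.shape)  # expected: (16,)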