Binary log loss in LGBM not as per derivative calculations found online

I am recreating the LightGBM binary log loss function using the first and second derivatives calculated with https://www.derivative-calculator.net.

But my plots are different from the plots produced by LightGBM's original definition.

Why are the plots different? Is my way of calculating the derivatives wrong?

As we know, loss = -y_true * log(y_pred) - (1 - y_true) * log(1 - y_pred), where y_pred = sigmoid(logits).
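For concreteness, here is a minimal numpy sketch of that loss (the function names are mine, not from LightGBM):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def binary_log_loss(y_true, logits):
    # loss = -y*log(p) - (1-y)*log(1-p), with p = sigmoid(logits)
    p = sigmoid(logits)
    return -y_true * np.log(p) - (1 - y_true) * np.log(1 - p)

print(binary_log_loss(1.0, 2.0))  # roughly 0.127: a confident, correct logit gives a small loss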

Here is what the calculator finds for -y log(1/(1+e^-x)) - (1-y) log(1-1/(1+e^-x)).

First derivative:

-((y-1) e^x + y) / (e^x + 1)

and second derivative:

e^x / (e^x + 1)^2
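If you want to double-check the calculator, a quick symbolic pass with sympy (my addition, not part of the original post) reproduces the same expressions:

import sympy as sp

x, y = sp.symbols('x y')
p = 1 / (1 + sp.exp(-x))                  # sigmoid(x)
loss = -y * sp.log(p) - (1 - y) * sp.log(1 - p)

grad = sp.simplify(sp.diff(loss, x))      # algebraically equal to sigmoid(x) - y
hess = sp.simplify(sp.diff(loss, x, 2))   # algebraically equal to e^x / (e^x + 1)^2
print(grad)
print(hess)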

When I plot the above with this code:

import numpy as np
import pandas as pd

def custom_odds_loss(y_true, y_pred):
    y = y_true
    # ======================
    # Inverse sigmoid: convert predicted probabilities back to raw logits
    # ======================
    epsilon_ = 1e-7
    y_pred = np.clip(y_pred, epsilon_, 1 - epsilon_)
    y_pred = np.log(y_pred/(1-y_pred))
    # ======================

    # First and second derivatives of the log loss w.r.t. the logits,
    # exactly as the derivative calculator wrote them
    grad = -((y-1)*np.exp(y_pred)+y)/(np.exp(y_pred)+1)
    hess = np.exp(y_pred)/(np.exp(y_pred)+1)**2

    return grad, hess

# Penalty chart for True 1s all the time
y_true_k = np.ones((1000, 1))
y_pred_k = np.expand_dims(np.linspace(0, 1, 1000), axis=1)
grad, hess = custom_odds_loss(y_true_k, y_pred_k)
data_ = {
    'Payoff@grad': grad.flatten(),
}
pd.DataFrame(data_).plot(title='Target=1(G)|Penalty(y-axis) vs Probability/1000. (x-axis)');
data_ = {
    'Payoff@hess': hess.flatten(),
}
pd.DataFrame(data_).plot(title='Target=1(H)|Penalty(y-axis) vs Probability/1000. (x-axis)');

Now, the actual plot from LightGBM's own definition:

def custom_odds_loss(y_true, y_pred):
    # ======================
    # Inverse sigmoid
    # ======================
    epsilon_ = 1e-7
    y_pred = np.clip(y_pred, epsilon_, 1 - epsilon_)
    y_pred = np.log(y_pred/(1-y_pred))
    # ======================

    grad = y_pred - y_true
    hess = y_pred * (1. - y_pred)
    return grad, hess

# Penalty chart for True 1s all the time
y_true_k = np.ones((1000, 1))
y_pred_k = np.expand_dims(np.linspace(0, 1, 1000), axis=1)

grad, hess = custom_odds_loss(y_true_k, y_pred_k)

data_ = {
    'Payoff@grad': grad.flatten(),
}
pd.DataFrame(data_).plot(title='Target=1(G)|Penalty(y-axis) vs Probability/1000. (x-axis)');
data_ = {
    'Payoff@hess': hess.flatten(),
}
pd.DataFrame(data_).plot(title='Target=1(H)|Penalty(y-axis) vs Probability/1000. (x-axis)');

The second function does not need the inverse sigmoid.

You see, the derivatives the calculator found simplify nicely:

-((y-1) e^x + y) / (e^x + 1) = e^x / (e^x + 1) - y = sigmoid(x) - y = y_pred - y_true

e^x / (e^x + 1)^2 = sigmoid(x) * (1 - sigmoid(x)) = y_pred * (1 - y_pred)

This simplification lets us skip inverting anything and compute the gradient and second derivative directly from the predicted probabilities:

def custom_odds_loss(y_true, y_pred):
    # y_pred is already a probability (the sigmoid output), so no inverse sigmoid is needed
    grad = y_pred - y_true
    hess = y_pred * (1. - y_pred)
    return grad, hess
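As a quick sanity check (a sketch I added; custom_odds_loss_logits is just my name for the question's first, inverse-sigmoid version), both forms give the same gradients and hessians for the same probabilities:

import numpy as np

def custom_odds_loss_logits(y_true, y_pred):
    # The question's first version: convert probabilities back to logits first
    epsilon_ = 1e-7
    y_pred = np.clip(y_pred, epsilon_, 1 - epsilon_)
    y_pred = np.log(y_pred / (1 - y_pred))
    grad = -((y_true - 1) * np.exp(y_pred) + y_true) / (np.exp(y_pred) + 1)
    hess = np.exp(y_pred) / (np.exp(y_pred) + 1) ** 2
    return grad, hess

def custom_odds_loss(y_true, y_pred):
    # Simplified version: y_pred is already a probability
    grad = y_pred - y_true
    hess = y_pred * (1. - y_pred)
    return grad, hess

y_true_k = np.ones((1000, 1))
y_pred_k = np.expand_dims(np.linspace(0.001, 0.999, 1000), axis=1)

g1, h1 = custom_odds_loss_logits(y_true_k, y_pred_k)
g2, h2 = custom_odds_loss(y_true_k, y_pred_k)

print(np.allclose(g1, g2), np.allclose(h1, h2))  # expected: True True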