Custom Loss Function for Reward using Keras in Python

Question

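I have a model and I want to build a custom loss function for it. My states are my X values, and my actions, which are 7 one-hot categorical values, are the Y values I am predicting.

However, I am not sure how to pass the reward into the loss function. I am also not sure what the actual function should be, but I can experiment with that later.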

from keras import backend as K

x = input_data[:, :-2]  # States
y = input_data[:, -2]  # Actions
r = input_data[:, -1]  # Rewards

# Keras calls a loss function as loss(y_true, y_pred)
def custom_loss(y_true, y_pred):
    loss = K.square(y_pred - y_true) * r
    return loss

model.compile(loss=custom_loss, optimizer='adam', metrics=['accuracy'])
model.fit(x, y)

Answer 1

You can write a function that returns another function, and pass the reward as an argument to the outer function:

def penalized_loss(reward):
  def custom_loss(y_true, y_pred):
    return K.mean(K.square(y_pred - y_true) - K.square(y_true - reward), axis=-1)

  return custom_loss

# ...
model.compile(loss=[penalized_loss(reward=r)], optimizer='adam', metrics=['accuracy'])
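To make the closure idea concrete, here is a minimal sketch in plain Python with no Keras involved. It reuses the formula above on single scalar values instead of tensors, which is a simplification made only for illustration. The names `loss_a` and `loss_b` are hypothetical.

```python
def penalized_loss(reward):
    """Return a loss function that remembers `reward` through a closure."""
    def custom_loss(y_true, y_pred):
        # Same expression as in the answer, applied to scalars
        # rather than Keras tensors (no mean over an axis here).
        return (y_pred - y_true) ** 2 - (y_true - reward) ** 2
    return custom_loss


# Each call to the outer function builds a separate loss
# with its own captured reward.
loss_a = penalized_loss(reward=1.0)
loss_b = penalized_loss(reward=0.0)

print(loss_a(1.0, 0.5))  # 0.25 - 0.0 = 0.25
print(loss_b(1.0, 0.5))  # 0.25 - 1.0 = -0.75
```

The inner function keeps the two-argument signature that Keras expects, while the extra value travels along in the enclosing scope.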

I have also put together a gist with a very silly working example: https://gist.github.com/kolygri/c222adba4dff710c6c53bf83c0ed5d21
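One thing to keep in mind with the closure approach: the loss function sees one mini-batch of `y_true` and `y_pred` at a time, while the captured reward is fixed, so whatever you capture has to broadcast against each batch. The sketch below uses NumPy only to show the shapes involved in a reward-weighted squared error like the one in the question. The toy `input_data` and the uniform `y_pred` are made-up stand-ins, not real data or model output.

```python
import numpy as np

# Hypothetical toy data: 4 samples, each with 3 state features,
# followed by an action id column and a reward column.
input_data = np.array([
    [0.1, 0.2, 0.3, 2.0, 1.0],
    [0.4, 0.5, 0.6, 0.0, -1.0],
    [0.7, 0.8, 0.9, 6.0, 0.5],
    [1.0, 1.1, 1.2, 3.0, 2.0],
])

x = input_data[:, :-2]                   # states, shape (4, 3)
actions = input_data[:, -2].astype(int)  # integer action ids in [0, 7)
r = input_data[:, -1]                    # rewards, shape (4,)

num_actions = 7
y = np.eye(num_actions)[actions]         # one-hot targets, shape (4, 7)

# Stand-in for model predictions: a uniform distribution over actions.
y_pred = np.full_like(y, 1.0 / num_actions)

# Squared error averaged over the action axis gives one value per sample,
# which can then be scaled by that sample's reward.
per_sample = np.mean(np.square(y_pred - y), axis=-1)  # shape (4,)
weighted = per_sample * r                             # shape (4,)

print(y.shape, per_sample.shape, weighted.shape)
```

If a per-sample scaling like this is all you need, `model.fit` also accepts a `sample_weight` argument, for example `model.fit(x, y, sample_weight=r)` with a standard loss. Keras then slices the weights per batch together with `x` and `y`, which sidesteps the shape issue. Whether plain reward weighting is the right objective for your problem is a separate question.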

python

reinforcement-learning

loss-function