Why do we use the loss to update our model but use the metrics to choose the model we need?

First of all, I am confused about why we use the loss to update the model but use the metrics to choose the model we need.

Maybe not all code does this, but most of the code I've seen uses EarlyStopping to monitor a metric on the validation data to find the best epoch (and the loss and the metric are different).

Since you have chosen to use the loss to update the model, why not use the loss to select the model? After all, the loss and the metrics are not exactly the same. It gives me the impression that you train toward one objective and then evaluate with a different indicator, which feels very strange to me. Take a regression problem as an example: when someone uses 'mse' as their loss, why do they define metrics=['mae'] and monitor that for early stopping or for reducing the learning rate? I just can't understand it, and I want to know what the advantage of doing this is.

Secondly, when the training data is imbalanced and the problem is a classification problem, some tutorials tell you to use F1 or AUC as your metric, and they say it will improve the problem caused by imbalanced data. I don't know why these metrics can improve the problem caused by imbalanced data.

Thirdly, I am confused about why someone would pass more than one metric to the metrics parameter of the compile function. I don't understand why multiple, why not one. What is the advantage of defining multiple metrics over one?

I seem to have too many questions, and they have been bothering me for a long time.

Thank you for your kind answer.


The above is what I wrote originally. Some people felt my question was too broad, so I would like to rephrase it.

Now suppose that there is a binary classification problem, and the data is not balanced. The ratio of positive and negative classes is 500:1.

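I chose a DNN as my classification model, and I chose cross entropy as my loss. The question now is whether I should also choose cross entropy as my metric, or whether I should choose something else, and why.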

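Here is what I have gathered from other people's answers: when the problem is a regression problem, the usual metrics and losses are differentiable, so whether you choose the same metric and loss or different ones depends entirely on your own understanding of the problem. But if the problem is a classification problem, the metrics we want are not differentiable, so we choose a different loss and metric; F1 and AUC, for example, are both non-differentiable. Why don't we just choose cross entropy as the metric?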

This question is probably too broad for SO; however, here are a few things that will hopefully help you...

Since you have chosen to use the loss to update the model, why not use the loss to select the model?

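Because, while the loss is the quantity we have to optimize from a mathematical perspective, the quantity of interest from a business perspective is the metric; in other words, at the end of the day, as users of the model, we are interested in the metric, not the loss (at least for settings where these two quantities are different by default, such as classification problems).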

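That said, selecting the model based on the loss is a perfectly valid strategy, too; as always, there is some subjectivity, and it depends on the specific problem.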

Take a regression problem as an example: when someone uses 'mse' as their loss, why do they define metrics=['mae']

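This is not the norm, and it is far from standard; normally, for regression problems, it is perfectly natural to use the loss as the metric, too. I agree with you that choices like the one you mention seem unnatural, and in general they do not seem to make much sense. Keep in mind that the fact that someone used it in a blog or elsewhere does not necessarily make it "correct" (or a good idea), but it is difficult to argue in general without considering the possible arguments for the specific case.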
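To see why mixing the two can be awkward, here is a minimal pure-Python sketch (the targets and the two sets of predictions are made-up toy values, and `pred_a`/`pred_b` are hypothetical names) in which MSE and MAE disagree about which model is better. If you train on one and select on the other, you may be pulling in two different directions:

```python
def mse(y_true, y_pred):
    """Mean squared error: penalizes large errors heavily."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error: penalizes all errors linearly."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data (illustrative only, not from any real experiment).
y_true = [0.0, 0.0, 0.0, 0.0]
pred_a = [0.0, 0.0, 0.0, 4.0]  # model A: three perfect predictions, one large miss
pred_b = [1.5, 1.5, 1.5, 1.5]  # model B: moderately off everywhere

# Map each model name to its (MSE, MAE) pair.
scores = {
    "A": (mse(y_true, pred_a), mae(y_true, pred_a)),
    "B": (mse(y_true, pred_b), mae(y_true, pred_b)),
}

best_by_mse = min(scores, key=lambda name: scores[name][0])
best_by_mae = min(scores, key=lambda name: scores[name][1])

print("scores (mse, mae):", scores)
print("best by MSE:", best_by_mse)
print("best by MAE:", best_by_mae)
```

With these toy values MSE prefers model B while MAE prefers model A, so the choice of what to monitor is really a choice about which kind of error you care about.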

I don't know why these metrics [F1 or AUC] can improve the problem caused by imbalanced data.

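They don't "improve" anything - they are just more appropriate than accuracy. On a heavily imbalanced dataset (think of a 99% majority class), a naive approach will simply classify everything as the majority class, which gives 99% accuracy without the model having learned anything.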
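As a rough illustration, the sketch below (pure Python, with a synthetic 990:10 label set and hand-written helper functions that are not taken from any library) scores a "classifier" that always predicts the majority class:

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def f1(y_true, y_pred, positive=1):
    """F1 score for the given positive class: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    # Convention assumed here: return 0.0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


# Synthetic labels: 99% majority class (0), 1% minority class (1).
y_true = [0] * 990 + [1] * 10
# A "model" that has learned nothing and always predicts the majority class.
y_pred = [0] * 1000

acc = accuracy(y_true, y_pred)
f1_minority = f1(y_true, y_pred, positive=1)

print("accuracy:", acc)
print("F1 (minority class):", f1_minority)
```

The accuracy comes out at 0.99, while the F1 score for the minority class is 0.0, which is a much more honest summary of a model that never finds a single positive example.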

I am confused about why someone would pass more than one metric to the metrics parameter of the compile function. I don't understand why multiple, why not one. What is the advantage of defining multiple metrics over one?

Again, generally speaking, there is no advantage, and this is not the norm either; but everything depends on the possible specifics.


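UPDATE (after comments): Limiting the discussion to classification settings (since in regression the loss and the metric can be the same thing), similar questions pop up rather frequently, I guess because the subtle differences between the loss and the various available metrics (accuracy, precision, recall, F1 score, etc.) are not well understood; consider, for example, the inverse of your question: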

and the links therein. Quoting from one of my own answers:

Loss and accuracy are different things; roughly speaking, the accuracy is what we are actually interested in from a business perspective, while the loss is the objective function that the learning algorithms (optimizers) are trying to minimize from a mathematical perspective. Even more roughly speaking, you can think of the loss as the "translation" of the business objective (accuracy) to the mathematical domain, a translation which is necessary in classification problems (in regression ones, usually the loss and the business objective are the same, or at least can be the same in principle, e.g. the RMSE)...
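To make the difference between the two quantities concrete, here is a small pure-Python sketch (the labels and predicted probabilities are made-up toy values, and the 0.5 decision threshold is an assumption) where one set of predictions has the lower cross-entropy loss and the other has the higher accuracy:

```python
import math


def binary_cross_entropy(y_true, probs):
    """Mean binary cross entropy; probs are predicted P(class == 1)."""
    total = 0.0
    for t, p in zip(y_true, probs):
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


def accuracy(y_true, probs, threshold=0.5):
    """Accuracy after turning probabilities into hard labels at the threshold."""
    correct = sum(1 for t, p in zip(y_true, probs) if (p >= threshold) == (t == 1))
    return correct / len(y_true)


# Toy labels and predictions (illustrative only).
y_true = [1, 1, 0, 0]
probs_a = [0.55, 0.55, 0.45, 0.45]  # all on the right side of 0.5, barely confident
probs_b = [0.99, 0.99, 0.01, 0.55]  # very confident, but the last sample is wrong

loss_a, acc_a = binary_cross_entropy(y_true, probs_a), accuracy(y_true, probs_a)
loss_b, acc_b = binary_cross_entropy(y_true, probs_b), accuracy(y_true, probs_b)

print(f"A: loss={loss_a:.3f}, accuracy={acc_a:.2f}")
print(f"B: loss={loss_b:.3f}, accuracy={acc_b:.2f}")
```

Here predictions B have the lower loss, yet predictions A have the higher accuracy, so selecting by loss and selecting by the metric would pick different candidates.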

You may also find the related discussion helpful.