Why is the Keras MAPE metric exploding during training while the MSE loss is not?

Question

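I implemented an LSTM in Keras to reproduce this paper. The odd behavior is simple to describe: I use MSE as the loss function, with MAPE and MAE as metrics. During training the MAPE explodes, while MSE and MAE appear to train normally: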

Epoch 1/20
275/275 [==============================] - 191s 693ms/step - loss: 0.1005 - mape: 15794.8682 - mae: 0.2382 - val_loss: 0.0334 - val_mape: 24.9470 - val_mae: 0.1607
Epoch 2/20
275/275 [==============================] - 184s 669ms/step - loss: 0.0099 - mape: 6385.5464 - mae: 0.0725 - val_loss: 0.0078 - val_mape: 11.3268 - val_mae: 0.0803
Epoch 3/20
275/275 [==============================] - 186s 676ms/step - loss: 0.0025 - mape: 5909.3735 - mae: 0.0369 - val_loss: 0.0131 - val_mape: 14.9827 - val_mae: 0.1061
Epoch 4/20
275/275 [==============================] - 187s 678ms/step - loss: 0.0015 - mape: 4746.2788 - mae: 0.0278 - val_loss: 0.0142 - val_mape: 16.1894 - val_mae: 0.1122
Epoch 5/20
 30/275 [==>...........................] - ETA: 2:38 - loss: 0.0012 - mape: 9.3647 - mae: 0.0246

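The MAPE explodes at the end of each epoch. What could cause this specific behavior?

The MAPE is still decreasing from epoch to epoch, so is this not really a problem, given that it does not hinder training?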

Answer 1

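Your loss and MAPE are both decreasing, which is a good sign. If the high MAPE values worry you, check whether any of your Y values are close to zero. MAPE is a percentage error, so each absolute error is divided by the true value, and a near-zero target inflates the result enormously.

MAPE results can be misleading. From Wikipedia: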
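To make the near-zero effect concrete, here is a minimal sketch in plain Python. The `mape` helper clips the denominator with a small `eps`, which is my understanding of how Keras avoids division by zero; the exact value `1e-7` and all the sample numbers are assumptions chosen for illustration, not values from the question.

```python
def mae(y_true, y_pred):
    # Mean absolute error: depends only on the size of the errors.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    # Mean squared error: also independent of the magnitude of y_true.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred, eps=1e-7):
    # Mean absolute percentage error. The denominator is clipped to eps
    # (assumed value) so that an exact zero target does not raise an error.
    ratios = [abs(t - p) / max(abs(t), eps) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical batch whose targets are all well away from zero.
y_true_regular = [0.50, 0.40, 0.60, 0.30]
y_pred_regular = [0.52, 0.38, 0.61, 0.32]

# Same absolute errors, but the last target is almost zero.
y_true_near_zero = [0.50, 0.40, 0.60, 0.0001]
y_pred_near_zero = [0.52, 0.38, 0.61, 0.0201]

for name, yt, yp in [
    ("regular", y_true_regular, y_pred_regular),
    ("near zero", y_true_near_zero, y_pred_near_zero),
]:
    print(f"{name:>9}: mse={mse(yt, yp):.6f} mae={mae(yt, yp):.4f} mape={mape(yt, yp):.2f}")
```

Both batches have essentially the same MSE and MAE, but the second batch reports a MAPE in the thousands because of a single sample. Since the value Keras prints is, as far as I know, a running average over the batches seen so far in the epoch, a few such samples could be enough to dominate the figure shown at the end of an epoch.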

Although the concept of MAPE sounds very simple and convincing, it has major drawbacks in practical application, and there are many studies on shortcomings and misleading results from MAPE.

It cannot be used if there are zero values (which sometimes happens for example in demand data) because there would be a division by zero.

For forecasts which are too low the percentage error cannot exceed 100%, but for forecasts which are too high there is no upper limit to the percentage error.

MAPE puts a heavier penalty on negative errors, than on positive errors.
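The asymmetry described in the quote can be checked with a one-line percentage error. This is only an illustration under the assumption that actual values are positive and forecasts are non-negative; the helper name `ape` and the numbers are hypothetical.

```python
def ape(actual, forecast):
    # Absolute percentage error for a single observation.
    return 100.0 * abs(actual - forecast) / abs(actual)


actual = 100.0

# The lowest possible non-negative forecast is 0, so an under-forecast
# can never be off by more than 100%.
under = ape(actual, 0.0)

# An over-forecast has no such ceiling.
over = ape(actual, 350.0)

print(under, over)  # 100.0 250.0
```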

To overcome these issues with MAPE, there are some other measures proposed in literature:

Mean Absolute Scaled Error (MASE)

Symmetric Mean Absolute Percentage Error (sMAPE)

Mean Directional Accuracy (MDA)

Mean Arctangent Absolute Percentage Error (MAAPE)
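As a rough sketch of why two of these alternatives behave better near zero, the functions below implement one common definition each of sMAPE and MAAPE in plain Python. Definitions vary between sources, so treat the exact formulas, the handling of zero targets, and the sample data as assumptions rather than a reference implementation.

```python
import math


def smape(y_true, y_pred):
    # Symmetric MAPE: the error is divided by the mean of |actual| and
    # |forecast|, so each term is bounded by 2 (200%).
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denom = (abs(t) + abs(p)) / 2.0
        total += 0.0 if denom == 0 else abs(t - p) / denom
    return 100.0 * total / len(y_true)


def maape(y_true, y_pred):
    # Mean arctangent absolute percentage error: arctan maps the ratio
    # [0, inf) onto [0, pi/2), so a near-zero target cannot blow up the mean.
    total = 0.0
    for t, p in zip(y_true, y_pred):
        if t == 0:
            # Limit of arctan(|error| / |t|) as t goes to 0 (assumed convention).
            total += 0.0 if p == 0 else math.pi / 2
        else:
            total += math.atan(abs(t - p) / abs(t))
    return total / len(y_true)


# Hypothetical data with one target very close to zero.
y_true = [0.50, 0.40, 0.60, 0.0001]
y_pred = [0.52, 0.38, 0.61, 0.0201]

print(f"sMAPE = {smape(y_true, y_pred):.2f} (at most 200)")
print(f"MAAPE = {maape(y_true, y_pred):.4f} (at most pi/2)")
```

Either function could be wrapped as a custom Keras metric, but that wiring is left out here because it depends on the Keras/TensorFlow version in use.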
