Why are there 3 losses but 2 accuracies in Keras LSTM training?

Question

My model looks like this:

from tensorflow.keras.layers import Input, LSTM, Dense, Activation
from tensorflow.keras.models import Model

def _get_model(input_shape, latent_dim, num_classes):
  inputs = Input(shape=input_shape)
  lstm_lyr,state_h,state_c = LSTM(latent_dim,dropout=0.1,return_state = True)(inputs)
  fc_lyr = Dense(num_classes)(lstm_lyr)
  soft_lyr = Activation('relu')(fc_lyr)
  model = Model(inputs, [soft_lyr,state_c])
  model.compile(optimizer='adam', loss='mse', metrics=['accuracy'])
  return model

model = _get_model((n_steps_in, n_features), latent_dim, n_steps_out)
history = model.fit(X_train,Y_train)

During training I get:

Epoch 1/2000
1/1 [==============================] - 1s 698ms/step - loss: 0.2338 - activation_26_loss: 0.1153 - lstm_151_loss: 0.1185 - activation_26_accuracy: 0.0000e+00 - lstm_151_accuracy: 0.0000e+00 - val_loss: 0.2341 - val_activation_26_loss: 0.1160 - val_lstm_151_loss: 0.1181 - val_activation_26_accuracy: 0.0000e+00 - val_lstm_151_accuracy: 0.0000e+00
Epoch 2/2000
1/1 [==============================] - 0s 34ms/step - loss: 0.2328 - activation_26_loss: 0.1153 - lstm_151_loss: 0.1175 - activation_26_accuracy: 0.0000e+00 - lstm_151_accuracy: 0.0000e+00 - val_loss: 0.2329 - val_activation_26_loss: 0.1160 - val_lstm_151_loss: 0.1169 - val_activation_26_accuracy: 0.0000e+00 - val_lstm_151_accuracy: 0.0000e+00
Epoch 3/2000
1/1 [==============================] - 0s 38ms/step - loss: 0.2316 - activation_26_loss: 0.1153 - lstm_151_loss: 0.1163 - activation_26_accuracy: 0.0000e+00 - lstm_151_accuracy: 0.0000e+00 - val_loss: 0.2315 - val_activation_26_loss: 0.1160 - val_lstm_151_loss: 0.1155 - val_activation_26_accuracy: 0.0000e+00 - val_lstm_151_accuracy: 0.0000e+00

When I look at the history:

print(history.history.keys())
dict_keys(['loss', 'activation_26_loss', 'lstm_151_loss', 'activation_26_accuracy', 'lstm_151_accuracy', 'val_loss', 'val_activation_26_loss', 'val_lstm_151_loss', 'val_activation_26_accuracy', 'val_lstm_151_accuracy'])

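Which ones are the training loss and the training accuracy?
Since there are only 2 outputs, why are there 3 losses (loss, activation_26_loss and lstm_151_loss) but 2 accuracies (activation_26_accuracy and lstm_151_accuracy)? What does each loss and each accuracy represent?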

Answer 1

TL;DR

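There are three losses (2 + 1): one for each of the two outputs, plus one that combines the two. By default each loss has a weight of 1.0, so the combined loss is simply their sum (0.1153 + 0.1185 = 0.2338 in your log). You can set the losses and their weights explicitly.
There are two accuracies because there are 2 outputs. Metrics are only for the user to look at and do not affect the neural network.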

Detailed explanation

Let's first look at what you are doing here. (I am referring to your previous question to get the input shapes.)

import numpy as np
from tensorflow.keras import layers, Model, utils

def _get_model(input_shape, latent_dim, num_classes):
    inputs = layers.Input(shape=input_shape)
    lstm_lyr,state_h,state_c = layers.LSTM(latent_dim,dropout=0.1,return_state = True)(inputs)
    fc_lyr = layers.Dense(num_classes)(lstm_lyr)
    soft_lyr = layers.Activation('relu')(fc_lyr)
    model = Model(inputs, [soft_lyr,state_c])   #<------- One input, 2 outputs
    model.compile(optimizer='adam', loss='mse')
    return model


#Dummy data
X = np.random.random((100,15,5))
y1 = np.random.random((100,4))
y2 = np.random.random((100,7))

model =_get_model((15, 5), 7 , 4)

You are building a supervised model that takes an input of shape (15, 5) and produces 2 outputs: first a (4,) vector from the Dense layer, meant to hold the scores for the 4 classes, and second a (7,) vector holding the final cell state of the LSTM's 7 units. The loss used to train the model to predict both outputs is mse.

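Since this is a supervised model, you must give it samples of both the inputs and the outputs. If you have 100 samples, your input has shape (100, 15, 5) and your targets have shapes (100, 4) and (100, 7), one array per output.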

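Loss(y_actual, y_pred) is a function that tells the neural network how far its predictions are from the actual values. Based on that, the network updates itself (specifically its weights, using backpropagation) so that its predictions get closer and closer to the actual values, which reduces the loss.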
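To make Loss(y_actual, y_pred) concrete, here is a minimal sketch of mean squared error in plain Python. It is not how Keras implements mse internally (Keras works on batched tensors); the function name and the sample vectors are made up for illustration.

```python
def mse(y_actual, y_pred):
    """Mean squared error between two equal-length sequences of numbers."""
    if len(y_actual) != len(y_pred):
        raise ValueError("y_actual and y_pred must have the same length")
    return sum((a - p) ** 2 for a, p in zip(y_actual, y_pred)) / len(y_actual)

# Hypothetical target and two hypothetical predictions for a 4-value output.
target = [1.0, 0.0, 0.0, 0.0]
perfect = mse(target, [1.0, 0.0, 0.0, 0.0])  # prediction equals the target
poor = mse(target, [0.0, 1.0, 0.0, 0.0])     # prediction puts the 1 in the wrong slot

print(perfect, poor)
```

The closer the prediction is to the target, the smaller the value, and that is the quantity training tries to push down for each output.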

If the points above are clear, let's look at what this network is actually doing.

Your current model has 1 input and 2 outputs.

model.compile(optimizer='adam', loss='mse', metrics=['accuracy'])

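Since you have defined mse as the loss, both outputs are trained to minimize mse. These are 2 of the 3 losses: activation_26_loss is the loss for the output of the final Dense layer (after its activation), and lstm_151_loss is the loss for the LSTM cell state. Unless you name the layers explicitly, Keras generates these names automatically by appending a running counter to the layer type.
The entry named simply loss is the combination of the other 2 losses. With the default weights it is their sum, as the log shows: 0.1153 + 0.1185 = 0.2338. More on this below.
metrics=['accuracy'] is only a metric for the user to track. Since there are 2 outputs, you get 2 accuracy values, one per output. They do not affect the training of the neural network.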
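As a quick check on how the three loss entries relate, the sketch below copies the numbers from the "Epoch 1/2000" line of the log in the question into a plain dictionary and adds up the two per-output losses. The helper name is made up; the keys are the ones from history.history. Keys without a prefix are training values, and keys starting with val_ are the same quantities computed on the validation data.

```python
# Values copied from the "Epoch 1/2000" line of the training log in the question.
epoch_1 = {
    "loss": 0.2338,
    "activation_26_loss": 0.1153,
    "lstm_151_loss": 0.1185,
    "val_loss": 0.2341,
    "val_activation_26_loss": 0.1160,
    "val_lstm_151_loss": 0.1181,
}

def combined(logs, prefix=""):
    """Sum the two per-output losses (equal weights of 1.0 assumed)."""
    return logs[prefix + "activation_26_loss"] + logs[prefix + "lstm_151_loss"]

train_total = combined(epoch_1)                 # compare with epoch_1["loss"]
val_total = combined(epoch_1, prefix="val_")    # compare with epoch_1["val_loss"]

print(round(train_total, 4), epoch_1["loss"])
print(round(val_total, 4), epoch_1["val_loss"])
```

Up to the rounding in the printed log, both sums match the reported loss and val_loss, so the single loss entry is the overall training loss and the other two are its per-output parts.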

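Now, when working with neural networks, it is important to know which loss to use where. Here is a table describing which loss function and activation function to use for which type of problem.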

As you can see, using softmax with categorical_crossentropy is good practice for a multi-class problem. So let's rebuild the model with that change. We also want each output to minimize a different loss.
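For intuition on why softmax and categorical_crossentropy go together, here is a small plain-Python sketch of both. It only illustrates the math and is not the Keras implementation; the function names, the eps clipping value and the sample scores are assumptions chosen for the example.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(y_true, probs, eps=1e-7):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, probs))

logits = [2.0, 0.5, 0.1, -1.0]  # hypothetical Dense outputs for 4 classes
probs = softmax(logits)

# The loss is small when the true class is the one the model favours (class 0)
# and large when the true class got a low probability (class 3).
loss_if_class_0 = categorical_crossentropy([1, 0, 0, 0], probs)
loss_if_class_3 = categorical_crossentropy([0, 0, 0, 1], probs)

print(probs)
print(loss_if_class_0, loss_if_class_3)
```

A relu activation, by contrast, gives non-negative scores that do not sum to 1, so they cannot be read as class probabilities.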

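In addition, suppose one output matters more than the other. We can also tell the model how to weight the losses, so that it knows which loss to focus on and by how much.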

import numpy as np
from tensorflow.keras import layers, Model, utils

def _get_model(input_shape, latent_dim, num_classes):
    inputs = layers.Input(shape=input_shape)
    lstm_lyr,state_h,state_c = layers.LSTM(latent_dim,dropout=0.1,return_state = True)(inputs)
    fc_lyr = layers.Dense(num_classes)(lstm_lyr)
    soft_lyr = layers.Activation('softmax')(fc_lyr)       #<--- Softmax activation for the first output
    model = Model(inputs, [soft_lyr,state_c])
    model.compile(optimizer='adam',
                  loss=['categorical_crossentropy','mse'], #<--- 2 losses, one for each output
                  loss_weights=[0.4, 0.6])                #<--- 2 loss weights for final loss
    return model

#Dummy data
X = np.random.random((100,15,5))
y1 = np.random.random((100,4))
y2 = np.random.random((100,7))

model =_get_model((15, 5), 7 , 4)
utils.plot_model(model, show_layer_names=False, show_shapes=True)

Here, the final loss (named simply loss) is the weighted sum of the 2 individual losses, using the weights 0.4 and 0.6.
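The combination itself is just arithmetic. The sketch below shows it with made-up per-output loss values; the helper name and the numbers are hypothetical, and only the weights 0.4 and 0.6 come from the compile call above.

```python
def weighted_total(losses, weights):
    """Combine per-output losses into one scalar as a weighted sum."""
    if len(losses) != len(weights):
        raise ValueError("need exactly one weight per loss")
    return sum(w * l for w, l in zip(weights, losses))

# Hypothetical per-output loss values for one batch.
crossentropy_loss = 1.20  # first output (softmax)
mse_loss = 0.10           # second output (cell state)

# Weights in the same order as the model outputs, as in loss_weights=[0.4, 0.6].
total = weighted_total([crossentropy_loss, mse_loss], [0.4, 0.6])

# With weights of 1.0 each, the total is the plain sum.
unweighted = weighted_total([crossentropy_loss, mse_loss], [1.0, 1.0])

print(total, unweighted)
```

A larger weight makes that output's error count for more in the value the optimizer minimizes, which is how you steer the model toward the output you care about.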

Hope this clarifies what you are trying to achieve.

ON A SIDE NOTE: I am curious how you are getting the actual values of the final cell state in order to train the model to predict a cell state. Do let me know if that is your intention. It's not very clear what your final goal is here (as I had asked on your previous question as well).
