What is the highest possible loss when training a Keras model?

I am training my model with Keras and trying to read the evaluation statistics. I know what the loss function does, but what is the highest possible value? The closer to zero the better, but I don't know whether 0.2 is a good value. I can see that the loss keeps decreasing over more epochs and the accuracy is improving as well.

The code I use to train my model:

import numpy as np
import tensorflow as tf

def trainModel(bow, unitlabels, units):
    x_train = np.array(bow)
    print("X_train: ", x_train)
    y_train = np.array(unitlabels)
    print("Y_train: ", y_train)
    model = tf.keras.models.Sequential([
            tf.keras.layers.Dense(256, activation=tf.nn.relu),
            tf.keras.layers.Dropout(0.2),
            tf.keras.layers.Dense(len(units), activation=tf.nn.softmax)])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=50)
    return model

My results:

Epoch 1/50
1249/1249 [==============================] - 0s 361us/sample - loss: 0.8800 - acc: 0.7590
Epoch 2/50
1249/1249 [==============================] - 0s 90us/sample - loss: 0.4689 - acc: 0.8519
Epoch 3/50
1249/1249 [==============================] - 0s 90us/sample - loss: 0.3766 - acc: 0.8687
Epoch 4/50
1249/1249 [==============================] - 0s 92us/sample - loss: 0.3339 - acc: 0.8663
Epoch 5/50
1249/1249 [==============================] - 0s 89us/sample - loss: 0.3057 - acc: 0.8719
Epoch 6/50
1249/1249 [==============================] - 0s 87us/sample - loss: 0.2877 - acc: 0.8799
Epoch 7/50
1249/1249 [==============================] - 0s 88us/sample - loss: 0.2752 - acc: 0.8815
Epoch 8/50
1249/1249 [==============================] - 0s 89us/sample - loss: 0.2650 - acc: 0.8783
Epoch 9/50
1249/1249 [==============================] - 0s 92us/sample - loss: 0.2562 - acc: 0.8847
Epoch 10/50
1249/1249 [==============================] - 0s 91us/sample - loss: 0.2537 - acc: 0.8799
Epoch 11/50
1249/1249 [==============================] - 0s 89us/sample - loss: 0.2468 - acc: 0.8903
Epoch 12/50
1249/1249 [==============================] - 0s 88us/sample - loss: 0.2436 - acc: 0.8927
Epoch 13/50
1249/1249 [==============================] - 0s 89us/sample - loss: 0.2420 - acc: 0.8935
Epoch 14/50
1249/1249 [==============================] - 0s 88us/sample - loss: 0.2366 - acc: 0.8935
Epoch 15/50
1249/1249 [==============================] - 0s 94us/sample - loss: 0.2305 - acc: 0.8951
Epoch 16/50
1249/1249 [==============================] - 0s 98us/sample - loss: 0.2265 - acc: 0.8991
Epoch 17/50
1249/1249 [==============================] - 0s 90us/sample - loss: 0.2280 - acc: 0.8967
Epoch 18/50
1249/1249 [==============================] - 0s 90us/sample - loss: 0.2247 - acc: 0.8951
Epoch 19/50
1249/1249 [==============================] - 0s 92us/sample - loss: 0.2237 - acc: 0.8975
Epoch 20/50
1249/1249 [==============================] - 0s 102us/sample - loss: 0.2196 - acc: 0.8991
Epoch 21/50
1249/1249 [==============================] - 0s 102us/sample - loss: 0.2223 - acc: 0.8983
Epoch 22/50
1249/1249 [==============================] - 0s 102us/sample - loss: 0.2163 - acc: 0.8943
Epoch 23/50
1249/1249 [==============================] - 0s 100us/sample - loss: 0.2177 - acc: 0.8983
Epoch 24/50
1249/1249 [==============================] - 0s 101us/sample - loss: 0.2165 - acc: 0.8983
Epoch 25/50
1249/1249 [==============================] - 0s 100us/sample - loss: 0.2148 - acc: 0.9007
Epoch 26/50
1249/1249 [==============================] - 0s 98us/sample - loss: 0.2189 - acc: 0.8903
Epoch 27/50
1249/1249 [==============================] - 0s 98us/sample - loss: 0.2099 - acc: 0.9023
Epoch 28/50
1249/1249 [==============================] - 0s 98us/sample - loss: 0.2102 - acc: 0.9023
Epoch 29/50
1249/1249 [==============================] - 0s 94us/sample - loss: 0.2091 - acc: 0.8975
Epoch 30/50
1249/1249 [==============================] - 0s 90us/sample - loss: 0.2064 - acc: 0.9015
Epoch 31/50
1249/1249 [==============================] - 0s 90us/sample - loss: 0.2044 - acc: 0.9023
Epoch 32/50
1249/1249 [==============================] - 0s 90us/sample - loss: 0.2070 - acc: 0.9031
Epoch 33/50
1249/1249 [==============================] - 0s 90us/sample - loss: 0.2045 - acc: 0.9039
Epoch 34/50
1249/1249 [==============================] - 0s 94us/sample - loss: 0.2007 - acc: 0.9063
Epoch 35/50
1249/1249 [==============================] - 0s 90us/sample - loss: 0.1999 - acc: 0.9055
Epoch 36/50
1249/1249 [==============================] - 0s 103us/sample - loss: 0.2010 - acc: 0.9039
Epoch 37/50
1249/1249 [==============================] - 0s 111us/sample - loss: 0.2053 - acc: 0.9031
Epoch 38/50
1249/1249 [==============================] - 0s 99us/sample - loss: 0.2018 - acc: 0.9039
Epoch 39/50
1249/1249 [==============================] - 0s 90us/sample - loss: 0.2023 - acc: 0.9055
Epoch 40/50
1249/1249 [==============================] - 0s 90us/sample - loss: 0.2019 - acc: 0.9015
Epoch 41/50
1249/1249 [==============================] - 0s 92us/sample - loss: 0.2040 - acc: 0.8983
Epoch 42/50
1249/1249 [==============================] - 0s 103us/sample - loss: 0.2033 - acc: 0.8943
Epoch 43/50
1249/1249 [==============================] - 0s 97us/sample - loss: 0.2024 - acc: 0.9039
Epoch 44/50
1249/1249 [==============================] - 0s 90us/sample - loss: 0.2047 - acc: 0.9079
Epoch 45/50
1249/1249 [==============================] - 0s 90us/sample - loss: 0.1996 - acc: 0.9039
Epoch 46/50
1249/1249 [==============================] - 0s 91us/sample - loss: 0.1979 - acc: 0.9079
Epoch 47/50
1249/1249 [==============================] - 0s 90us/sample - loss: 0.1960 - acc: 0.9087
Epoch 48/50
1249/1249 [==============================] - 0s 97us/sample - loss: 0.1969 - acc: 0.9055
Epoch 49/50
1249/1249 [==============================] - 0s 99us/sample - loss: 0.1950 - acc: 0.9087
Epoch 50/50
1249/1249 [==============================] - 0s 98us/sample - loss: 0.1956 - acc: 0.9071

The maximum cross-entropy loss occurs when your classes are uniformly distributed, i.e. the model shows no preference for any class and the output has maximum entropy. Look at the formula for the loss of a single sample i:

loss^(i) = -sum_k y^(i)_k * ln(a^(i)_k)

From this you can compute the maximum loss; the logarithm used is typically the natural logarithm ln. Since your targets are one-hot, the sum reduces to -ln(a^(i)_k), and under the uniform assumption a^(i)_k = 1/len(units). For example, in binary classification, setting a = 0.5 gives -ln(0.5) ≈ 0.693147, so the maximum loss is about 0.69.
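This calculation is easy to check numerically. A minimal sketch (the helper name `max_uniform_loss` is mine, not from the question's code):

```python
import numpy as np

def max_uniform_loss(num_classes):
    """Cross-entropy loss when the model predicts the uniform
    distribution 1/num_classes and the targets are one-hot."""
    return -np.log(1.0 / num_classes)

# Binary classification: -ln(0.5)
print(max_uniform_loss(2))   # ~0.6931
# Ten classes: -ln(0.1)
print(max_uniform_loss(10))  # ~2.3026
```

Note that this is the loss of a model that always predicts the uniform distribution; a model that confidently predicts the wrong class can push the cross-entropy even higher, so treat -ln(1/len(units)) as the "no better than guessing" baseline rather than a hard ceiling.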