Loss function in LSTM neural network

Question

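I don't understand what is being minimized in these networks. Can someone explain what is happening mathematically when the loss gets smaller in an LSTM network?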

model.compile(loss='categorical_crossentropy', optimizer='adam')

Answer 1

From the Keras documentation, categorical_crossentropy is just the multiclass log loss. Math and theoretical explanation for log loss here.
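For reference, the multiclass log loss is usually written as below. This is the standard textbook form, not something quoted from the Keras docs; the symbols are my own notation: N is the number of samples in the batch, C the number of classes, y_{i,c} the one-hot target, and p_{i,c} the probability the model's softmax output assigns to class c for sample i.

```latex
L = -\frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{C} y_{i,c} \, \log p_{i,c}
```

Because y_{i,c} is 1 only for the correct class, each sample contributes just the negative log of the probability assigned to its true label. The loss getting smaller means those probabilities are moving toward 1.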

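Basically, the LSTM is assigning labels to words (or characters, depending on your model) and optimizing the model by penalizing incorrect labels in word (or character) sequences. The model takes an input word or character vector and tries to guess the next "best" word, based on the training examples. Categorical cross-entropy is a quantitative way of measuring how good the guess is: it is the negative log of the probability the model gave to the correct label, so a confident correct guess costs almost nothing and a confident wrong guess costs a lot. As the model iterates over the training set, it makes fewer mistakes in guessing the next best word (or character).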
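To make "how good the guess is" concrete, here is a minimal NumPy sketch of the same quantity. It is not the Keras implementation; the helper name, the three-character vocabulary, and the two probability vectors are made up for illustration.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean categorical cross-entropy over a batch of one-hot targets.

    Illustrative helper, not the Keras function of the same name.
    """
    # Clip so that log(0) never occurs.
    y_pred = np.clip(y_pred, eps, 1.0)
    # Per-sample loss is -log(probability of the true class); average over the batch.
    return float(-np.mean(np.sum(y_true * np.log(y_pred), axis=-1)))

# Hypothetical 3-character vocabulary; the true next character is index 1.
y_true = np.array([[0.0, 1.0, 0.0]])

# Made-up softmax outputs early and late in training.
early_guess = np.array([[0.4, 0.2, 0.4]])     # little mass on the right character
later_guess = np.array([[0.05, 0.9, 0.05]])   # most mass on the right character

loss_early = categorical_crossentropy(y_true, early_guess)  # equals -log(0.2)
loss_later = categorical_crossentropy(y_true, later_guess)  # equals -log(0.9)

print(loss_early, loss_later)
```

The second loss is much smaller than the first, which is what the training log shows when the loss goes down: the optimizer (adam in the question) has adjusted the LSTM weights so the softmax output puts more probability on the character that actually comes next.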

deep-learning

lstm

keras