Calculate loss in train epoch function

We have three kinds of loss.

As I understand it, loss is a tensor, batch_loss is the value of that tensor, and train_loss is the running sum of batch_loss. That part makes sense to me.

My question is: why does AllenNLP compute batch_loss inside the loop over batches instead of accumulating the loss over the whole batch_group?

I also don't understand why a batch_group is needed inside the epoch, and a batch inside the batch_group.

Here is my understanding: inside an epoch we have batch_groups, and inside each batch_group we have batches. batch_loss is computed per batch, not per batch_group. Why?

my question is why AllenNLP considered the batch_loss in for batch and did not calculate the cumulative loss for batch_group?

This is actually a bug, thanks for pointing it out! There is now an open PR to fix it: https://github.com/allenai/allennlp/pull/4706

Also I did not understand the need for batch_group inside epoch, and batch inside batch_group

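A batch_group always contains just one batch unless you set num_gradient_accumulation_steps to a value greater than 1, i.e. you are using gradient accumulation, which is a way to get a larger effective batch size.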
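
To make the epoch / batch_group / batch nesting concrete, here is a minimal pure-Python sketch of gradient accumulation. It is not AllenNLP's trainer code: `group_batches` and `train_epoch` are hypothetical helpers written for illustration, the "model" is a single weight w fitted with mean squared error, and the per-group loss accumulation is just one way to do it, not necessarily what the PR above implements.

```python
from itertools import islice


def group_batches(batches, group_size):
    """Yield lists of up to `group_size` consecutive batches (hypothetical helper)."""
    iterator = iter(batches)
    while True:
        batch_group = list(islice(iterator, group_size))
        if not batch_group:
            return
        yield batch_group


def train_epoch(batches, num_gradient_accumulation_steps=1, lr=0.01):
    """Toy epoch: fit y = w * x with mean squared error and plain SGD."""
    w = 0.0
    train_loss = 0.0
    optimizer_steps = 0
    for batch_group in group_batches(batches, num_gradient_accumulation_steps):
        grad = 0.0  # gradients accumulate across the whole group
        batch_group_loss = 0.0
        for batch in batch_group:
            # Per-batch loss and gradient, scaled by the group size so the
            # group behaves like one large batch.
            loss = sum((w * x - y) ** 2 for x, y in batch) / len(batch)
            loss = loss / len(batch_group)
            grad += sum(2 * (w * x - y) * x for x, y in batch) / len(batch) / len(batch_group)
            batch_group_loss += loss
        w -= lr * grad  # one optimizer step per batch_group
        optimizer_steps += 1
        train_loss += batch_group_loss
    return w, train_loss, optimizer_steps


small_batches = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0), (4.0, 8.0)],
    [(5.0, 10.0), (6.0, 12.0)],
    [(7.0, 14.0), (8.0, 16.0)],
]
# Four batches of size 2, accumulated two at a time: two optimizer steps.
w_accumulated, loss_accumulated, steps = train_epoch(
    small_batches, num_gradient_accumulation_steps=2
)
```

With the default of 1, every batch_group holds a single batch and the inner loop runs once, so the grouping is invisible. With 2, each optimizer step uses the gradient of two batches, which for equal-sized batches matches training on batches of size 4 without ever holding 4 examples in memory at once.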