How are P, R, and F scores calculated in spaCy CLI train NER?

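I am using the spaCy CLI train command for NER, with train_path set to the training dataset (train set) and dev_path set to the evaluation dataset (test set). The console output shows NER Precision, Recall, and F-score.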

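However, it is not clear to me how these scores are calculated. Are they the scores of the model predicting on the train set (train-scores) or on the test set (test-scores)?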

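I want to determine after which epoch to stop training in order to prevent overfitting. Currently after 60 epochs the Loss is still slightly decreasing and Precision, Recall, and F-score are still slightly increasing. It seems to me that the model may be memorizing the training data and that the P, R, and F scores are calculated on the train set, which would explain why they keep improving.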

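To my knowledge a good stopping point in training would be right before the test-scores start dropping again, even though the train-scores keep increasing. So I would like to compare the two over time (epochs).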

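My questions are:

Are the scores displayed in the console during training train-scores or test-scores?
How can I access the other one?
If they are train-scores, what is the test set (dev_path) used for?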

The loss is calculated from the training examples, as a side effect of calling nlp.update() in the training loop. However, all the other performance metrics are calculated on the dev set, by calling the Scorer.
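To make the dev-set metrics more concrete, here is a minimal sketch of how entity-level P, R, and F can be computed. It is not spaCy's own code: the function name `ner_prf` and the sample spans are made up for illustration, and it assumes that a predicted entity only counts as correct when its start, end, and label all match a gold entity exactly, with counts pooled over all dev documents.

```python
def ner_prf(docs):
    """Entity-level precision, recall and F1 (illustrative, not spaCy's API).

    `docs` is an iterable of (gold, predicted) pairs; each side is a
    collection of (start_char, end_char, label) tuples for one document.
    """
    tp = fp = fn = 0
    for gold, pred in docs:
        gold, pred = set(gold), set(pred)
        tp += len(gold & pred)   # predicted and in gold (exact match)
        fp += len(pred - gold)   # predicted but not in gold
        fn += len(gold - pred)   # in gold but missed by the model
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


# Hypothetical dev documents: (gold entities, predicted entities)
dev_docs = [
    ([(0, 5, "PERSON"), (10, 16, "ORG")], [(0, 5, "PERSON"), (10, 16, "GPE")]),
    ([(3, 9, "GPE")], []),
]
p, r, f = ner_prf(dev_docs)
print(f"P={p:.2f} R={r:.2f} F={f:.2f}")  # P=0.50 R=0.33 F=0.40
```

In this toy example the wrong label on the second span costs both a false positive and a false negative, which is why a mislabeled entity hurts precision and recall at the same time.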

To my knowledge a good stopping point in training would be right before the test-scores start dropping again, even though the train-scores keep increasing

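Yes, I agree. So looking at the spacy train results, this would be the point where the (training) loss is still decreasing while the (dev) F-score starts decreasing again.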
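If you copy the per-epoch dev F-scores out of the console output, picking the stopping point can be sketched roughly like this. The helper `early_stop_epoch`, the `patience` parameter, and the score values are all hypothetical and only illustrate the idea; none of them come from spaCy.

```python
def early_stop_epoch(dev_f_scores, patience=3):
    """Return (best_epoch, best_f) for a list of per-epoch dev F-scores.

    Scanning stops once the dev F-score has not improved for `patience`
    consecutive epochs; the epoch with the best score seen so far is returned.
    """
    best_epoch, best_f = 0, float("-inf")
    for epoch, f in enumerate(dev_f_scores):
        if f > best_f:
            best_epoch, best_f = epoch, f      # new best dev score
        elif epoch - best_epoch >= patience:
            break                              # no improvement for too long
    return best_epoch, best_f


# Made-up dev F-scores, one per epoch (0-indexed)
scores = [0.60, 0.70, 0.75, 0.74, 0.73, 0.72, 0.80]
print(early_stop_epoch(scores, patience=3))   # (2, 0.75)
```

A small `patience` stops early and can miss a later recovery (the 0.80 at the end of the toy list), so it is a trade-off between training time and the risk of stopping on a temporary dip.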

Currently after 60 epochs the Loss is still slightly decreasing and Precision, Recall, and F-score are still slightly increasing.

So it looks like you can train for a few more epochs :-)

named-entity-recognition

spacy