F1 score computation when target labels contain only a subset of the allowed labels

Question

I have a 3-class classification problem.
My predicted labels include all three classes.
However, only two of the three classes appear in my target labels.
For example:

predicted = [1,1,2,3,2,1]
target = [1,1,2,2,2,1]

How should I compute the F1 score in this case?
I am currently using sklearn's f1_score function with average='macro'.
But this yields a low F1 score for the case above.

Answer 1

f1_score provides a parameter called labels that lets you define the set of labels to include when average != 'binary'.

For example, if you are only interested in the classifier's performance on classes 1 and 2, you can do this:

from sklearn.metrics import f1_score


predicted = [1, 1, 2, 3, 2, 1]
target = [1, 1, 2, 2, 2, 1]

print(f1_score(target, predicted, average='macro', labels=[1, 2]))
# 0.9
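
To see why the unrestricted macro average comes out low, it can help to look at the per-class scores. The following is a minimal sketch, assuming a scikit-learn version that supports the zero_division argument (passed here only to silence the undefined-metric warning; it is not part of the original answer). Class 3 never appears in target, so its F1 is 0, and that zero pulls the three-class macro average down.

```python
from sklearn.metrics import f1_score

predicted = [1, 1, 2, 3, 2, 1]
target = [1, 1, 2, 2, 2, 1]

# average=None returns one F1 value per label, in the order given by `labels`.
# zero_division=0 is an assumption of this sketch: it only suppresses the
# warning raised for class 3, which has no true samples.
per_class = f1_score(target, predicted, average=None,
                     labels=[1, 2, 3], zero_division=0)
print(per_class)  # approximately [1.0, 0.8, 0.0]

# Without `labels`, macro averaging uses every label seen in target or predicted.
macro_all = f1_score(target, predicted, average='macro', zero_division=0)

# Restricting `labels` drops class 3 from the average.
macro_subset = f1_score(target, predicted, average='macro', labels=[1, 2])

print(macro_all, macro_subset)  # approximately 0.6 and 0.9
```

Whether class 3 should be excluded depends on what you want to measure: restricting labels ignores the cost of predicting a class that never occurs in the targets, apart from the missed class-2 sample it causes.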
