Meaning of weighted metrics in scikit-learn: does the bigger class get more weight, or the smaller class?

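I am working with an imbalanced dataset and am trying to account for that through my choice of validation metric. In the scikit-learn documentation I found the following description of the 'weighted' average: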

Calculate metrics for each label, and find their average weighted by support (the number of true instances for each label). This alters ‘macro’ to account for label imbalance; it can result in an F-score that is not between precision and recall.

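Does averaging weighted by support mean that classes with more samples get a higher weight than classes with fewer samples, or, as would seem more logical to me, that the smaller classes get a higher weight than the larger ones?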

I could not find anything about this in the documentation and want to make sure I pick the right metric.

Thanks!

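Short answer: weighting by support means that higher support gives higher weight. In other words, the more samples a class has, the more its score counts toward the average.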
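To check this yourself, here is a minimal sketch that computes the per-class F1 scores, weights them by support by hand, and compares the result with `average="weighted"`. The toy labels are made up purely for illustration.

```python
import numpy as np
from sklearn.metrics import f1_score

# Toy data (illustrative only): 8 samples of class 0, 2 samples of class 1.
y_true = np.array([0] * 8 + [1] * 2)
y_pred = np.array([0, 0, 0, 0, 0, 0, 0, 1, 0, 1])

# One F1 score per class, in label order [0, 1].
per_class_f1 = f1_score(y_true, y_pred, average=None)

# Support = number of true instances of each class.
support = np.bincount(y_true)

# Average of the per-class scores, weighted by support.
manual_weighted = np.average(per_class_f1, weights=support)

weighted_f1 = f1_score(y_true, y_pred, average="weighted")
macro_f1 = f1_score(y_true, y_pred, average="macro")

print("per-class F1:   ", per_class_f1)
print("support:        ", support)
print("manual weighted:", manual_weighted)
print("weighted F1:    ", weighted_f1)
print("macro F1:       ", macro_f1)
```

The hand-computed value matches `average="weighted"`, and because the larger class scores better here, the weighted average ends up above the macro average.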

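That said, note that you have not "handled" the class imbalance this way; you have only chosen a different way of computing your metric. I believe these averaging options are meant to give you another perspective on your model's performance.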

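Typically, a model performs much better on the majority class, and a support-weighted metric over-emphasizes exactly that. The model's performance on the minority class(es) stays the same, and may well be poor. If those classes happen to be the important ones, you may end up just fooling yourself.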
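As a rough illustration of that risk, consider a hypothetical classifier that always predicts the majority class. The 95/5 split below is an assumption chosen only to make the effect obvious.

```python
import numpy as np
from sklearn.metrics import recall_score, balanced_accuracy_score

# Assumed toy setup: 95 majority samples (class 0), 5 minority samples (class 1).
y_true = np.array([0] * 95 + [1] * 5)

# A degenerate model that always predicts the majority class.
y_pred = np.zeros_like(y_true)

# Recall per class, in label order [0, 1].
per_class_recall = recall_score(y_true, y_pred, average=None)

# Support-weighted average: dominated by the majority class.
weighted_recall = recall_score(y_true, y_pred, average="weighted")

# Unweighted mean over classes: each class counts equally.
macro_recall = recall_score(y_true, y_pred, average="macro")

# Balanced accuracy is the macro-averaged recall.
balanced_acc = balanced_accuracy_score(y_true, y_pred)

print("per-class recall:", per_class_recall)
print("weighted recall: ", weighted_recall)
print("macro recall:    ", macro_recall)
print("balanced acc.:   ", balanced_acc)
```

The weighted score looks excellent even though not a single minority sample is found. If the minority class matters to you, the per-class scores or a macro average are more telling.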

metrics

scikit-learn

imbalanced-data