How to set a class_weight dictionary for Random Forest?

Question

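I am working with an imbalanced dataset, so I decided to use a weight dictionary for classification.

The documentation says the weight dictionary must be defined as shown below: https://imbalanced-learn.org/stable/generated/imblearn.ensemble.BalancedRandomForestClassifier.html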

     weight_dict = [{0: 1, 1: 1}, {0: 1, 1: 5}, {0: 1, 1: 1}, {0: 1, 1: 1}]

Since I want to predict 12 classes, which are located in the last column, I assumed the setup would be as follows:

weight_dict = [{0: 1, 1: 5.77390289e-01}, {0: 1, 1: 6.48317326e-01}, 
               {0: 1, 1: 1.35324885e-01}, {0: 1, 1: 2.92665797e+00}, 
               {0: 1, 1: 5.77858906e+01}, {0: 1, 1: 1.73193507e+00},
               {0: 1, 1: 9.27828244e+00}, {0: 1, 1: 1.18766082e+01}, 
               {0: 1, 1: 8.99009985e+01}, {0: 1, 1: 6.39833279e+00}, 
               {0: 1, 1: 2.55347077e+01}, {0: 1, 1: 9.47015372e+02}]

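Honestly, I don't clearly understand the notation of the first indicators, I mean:

      0:1 of {0: 1, 1: 1}

or:

 1: value.

Do they represent column positions, or the order of the labels?

What is the correct way to set this up?

Thank you very much for your insights.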

Answer 1

I don't clearly understand the notation of the first indicators 0:1 of {0: 1, 1: 1}

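The notation is {<class label> : <count>}. The class label is given in its original (i.e. untransformed) representation.

For example, the following dictionary specifies an Iris training set containing 25 "setosa" samples and 50 samples each of "versicolor" and "virginica":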
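Applied to the 12-class target from the question, this notation suggests a single dictionary with one entry per class rather than a list of two-key dictionaries. scikit-learn's documentation describes the dict form of class_weight as {class_label: weight} and reserves the list-of-dicts form for multi-output problems, with one dict per column of y. A minimal sketch, assuming the 12 classes are labelled 0 to 11 (the actual labels are not given in the question) and that the weights from the question are listed in that same order:

```python
# Weights copied from the question; assumed to be listed in class order.
weights = [
    5.77390289e-01, 6.48317326e-01, 1.35324885e-01, 2.92665797e+00,
    5.77858906e+01, 1.73193507e+00, 9.27828244e+00, 1.18766082e+01,
    8.99009985e+01, 6.39833279e+00, 2.55347077e+01, 9.47015372e+02,
]

# Hypothetical labels: replace these with the values that actually
# occur in the target column (integers, strings, ...).
labels = list(range(12))

# One dictionary for the single target column: {class label: weight}.
weight_dict = dict(zip(labels, weights))

for label, weight in weight_dict.items():
    print(label, weight)
```

If the target column holds strings instead of integers, the keys would be those strings, since the labels are used untransformed.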

weight_dict = {"setosa" : 25, "versicolor" : 50, "virginica" : 50}
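As a rough illustration of keying the dictionary by the original labels, the sketch below passes the Iris dictionary to scikit-learn's RandomForestClassifier through its class_weight parameter. This is an assumption about how the dictionary would be used, not something stated in the answer. Note that scikit-learn documents the values as per-class weights, so in this setting 25 and 50 act as relative weights rather than sample counts. The n_estimators and random_state values are arbitrary choices for the example.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
X = iris.data
# Map the integer targets back to their original string labels.
y = iris.target_names[iris.target]

# Keys are the class labels exactly as they appear in y.
weight_dict = {"setosa": 25, "versicolor": 50, "virginica": 50}

# Arbitrary illustrative settings; only class_weight matters here.
clf = RandomForestClassifier(
    n_estimators=50,
    class_weight=weight_dict,
    random_state=0,
)
clf.fit(X, y)

# The fitted classes are the same string labels used as dictionary keys.
fitted_labels = [str(c) for c in clf.classes_]
print(fitted_labels)
```

The same pattern should carry over to a 12-class target: one key per label as it appears in the target column, each mapped to a single number.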


classification

random-forest

imblearn

imbalanced-data