What is the best value for the parameter class_weight in LinearSVC?

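I have multi-label data (some classes have 2 labels, some have 10), and my model overfits with both class_weight='balanced' and class_weight=None. What is the best value to set for the class_weight parameter?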

from sklearn.svm import LinearSVC
svm = LinearSVC(C=0.01,max_iter=100,dual=False,class_weight=None,verbose=1)

The class_weight parameter effectively scales the C parameter per class, as the documentation describes:

class_weight : {dict, ‘balanced’}, optional

Set the parameter C of class i to class_weight[i]*C for SVC. If not given, all classes are supposed to have weight one. The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y))
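To make the "balanced" formula concrete, here is a minimal sketch that evaluates it by hand and compares the result with scikit-learn's `compute_class_weight` helper. The toy label vector `y` is made up for illustration and is not from the question.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical toy labels: class 0 has 6 samples, class 1 has 3, class 2 has 1
y = np.array([0] * 6 + [1] * 3 + [2] * 1)

classes = np.unique(y)
n_samples = len(y)
n_classes = len(classes)

# The formula quoted in the documentation
manual = n_samples / (n_classes * np.bincount(y))

# The same weights as computed by scikit-learn
balanced = compute_class_weight(class_weight="balanced", classes=classes, y=y)

print(manual)
print(balanced)
```

The rarest class receives the largest weight, so its errors are penalized with a larger effective C.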

There is no single best value. Try different class_weight settings while keeping C fixed, e.g. C=0.1, and compare them on held-out data.
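One way to run that comparison is a cross-validated search over `class_weight` with C held constant. The following is only a sketch under assumptions: the synthetic data from `make_classification`, the candidate weight dictionary, and the `f1_macro` scoring choice are all illustrative and should be replaced with your own data and metric.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

# Hypothetical imbalanced 3-class data standing in for the real dataset
X, y = make_classification(
    n_samples=300, n_classes=3, n_informative=4,
    weights=[0.7, 0.2, 0.1], random_state=0,
)

# Candidate settings; the explicit dictionary is an arbitrary example
candidates = [None, "balanced", {0: 1, 1: 2, 2: 4}]

# C stays fixed at 0.1; only class_weight varies
search = GridSearchCV(
    LinearSVC(C=0.1, dual=False, max_iter=1000),
    param_grid={"class_weight": candidates},
    scoring="f1_macro",  # macro-averaged F1 treats every class equally
    cv=3,
)
search.fit(X, y)

print(search.best_params_, search.best_score_)
```

A macro-averaged metric is used here because plain accuracy can look good on imbalanced data even when the minority classes are ignored.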


EDIT

Here is a neat way to create the class_weight dictionary for your 171 classes.

# store the weights for each class in a list
weights_per_class = [2,3,4,5,6]

#Let's assume that you have a `y` like this:
y = [121, 122, 123, 124, 125]

Then you need:

# create the `class_weight` dictionary (pairs each class label with its weight)
class_weight = dict(zip(y, weights_per_class))

print(class_weight)
#{121: 2, 122: 3, 123: 4, 124: 5, 125: 6}

# Use it as argument
svm = LinearSVC(class_weight=class_weight)
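The snippet above treats `y` as a list of distinct class labels. A real training vector contains each label many times, so a hedged variant is to take the unique labels first and pair those with the weights. In this sketch `X_train`, `y_train`, and the weight values are hypothetical placeholders.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Hypothetical training labels in which each class appears twice
y_train = np.array([121, 122, 122, 123, 123, 124, 124, 125, 125, 121])
rng = np.random.RandomState(0)
X_train = rng.randn(len(y_train), 4)  # random features, for illustration only

# Sorted unique labels, so each class gets exactly one dictionary entry
classes = np.unique(y_train)
weights_per_class = [2, 3, 4, 5, 6]  # one weight per unique class
class_weight = dict(zip(classes.tolist(), weights_per_class))

svm = LinearSVC(C=0.1, dual=False, class_weight=class_weight)
svm.fit(X_train, y_train)

print(class_weight)
```

The dictionary keys must be the same label values that appear in `y_train`; classes left out of the dictionary keep a weight of one.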