Use Scikit-Learn's GridSearchCV to capture precision, recall, and f1 for all permutations?

Question

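I want to use Scikit-Learn's GridSearchCV to run a batch of experiments and then print the recall, precision, and f1 for each one.

This article (https://scikit-learn.org/stable/auto_examples/model_selection/plot_grid_search_digits.html) suggests that I need to run .fit and .predict multiple times: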

...
scores = ['precision', 'recall']
...
for score in scores:
    ...
    clf = GridSearchCV(
        SVC(), tuned_parameters, scoring='%s_macro' % score
    )
    clf.fit(X_train, y_train) # running for each scoring metric
    ...
    for mean, std, params in zip(means, stds, clf.cv_results_['params']):
        print("%0.3f (+/-%0.03f) for %r"
              % (mean, std * 2, params))
    ...
    y_true, y_pred = y_test, clf.predict(X_test) # running for each scoring metric
    print(classification_report(y_true, y_pred))

I would like to run .fit just once and record all the recall, precision, and f1 metrics. For example, something like this:

clf = GridSearchCV(
    SVC(), tuned_parameters, scoring=['recall', 'precision', 'f1'] # I don't think this syntax is even possible
)

clf.fit(X_train, y_train)

for metric in clf.something_that_i_cannot_find:
    ### does something like this exist?
    print(metric['precision'])
    print(metric['recall'])
    print(metric['f1'])
    ###:end does something like this exist?

Or even:

...
for run in clf.something_that_i_cannot_find:
    ### does something like this exist?
    print(classification_report(run.y_true, run.y_pred))
    ###:end does something like this exist?

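This article () suggests that GridSearchCV can be made aware of multiple scorers, but I still can't figure out how to access each score for every experiment.

Does GridSearchCV not support what I'm looking for? Is the approach used in the article (that is, running .fit and .predict multiple times) the simplest way to accomplish something like what I'm asking for?

Thanks for your time.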

Answer 1

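You will have to do this manually, which takes a fair amount of code: multiple loops with sklearn, one over the folds and another over the parameters. I suggest setting the random state for the fold strategy, the grid search, and the model, and running the grid search three times, once per metric.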
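A minimal sketch of what such a manual loop might look like, under some assumptions: it uses ParameterGrid and cross_val_predict (sklearn utilities this answer does not name), the iris data, an illustrative parameter grid, and macro averaging. The names param_grid and manual_results are hypothetical.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import precision_recall_fscore_support
from sklearn.model_selection import ParameterGrid, StratifiedKFold, cross_val_predict
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Illustrative grid; these values are assumptions, not taken from the question.
param_grid = {"kernel": ["linear", "rbf"], "C": [1, 10]}

# Fixed seed so every parameter combination is evaluated on the same folds.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

manual_results = []
for params in ParameterGrid(param_grid):
    # Out-of-fold predictions: each sample is predicted by a model
    # that was not trained on it.
    y_pred = cross_val_predict(SVC(**params), X, y, cv=cv)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y, y_pred, average="macro", zero_division=0
    )
    manual_results.append(
        {"params": params, "precision": precision, "recall": recall, "f1": f1}
    )

for row in manual_results:
    print(row)
```

Because y and y_pred are available for every combination here, classification_report(y, y_pred) could be printed inside the loop as well. Note that these scores are computed on the pooled out-of-fold predictions, so they can differ slightly from the per-fold averages that GridSearchCV reports.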

Answer 2

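You can do multi-metric evaluation for binary classification. When I tried it on the iris dataset, I got ValueError: Multi-class not supported.

I implemented it below on basic binary data, computing four different scores: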
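For the multi-class case, one possible workaround is to use the macro-averaged scorer names, assuming macro averaging is acceptable for the task. The sketch below passes them as a plain list; the grid values and the names X_iris, y_iris, multiclass_scoring, and iris_search are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X_iris, y_iris = load_iris(return_X_y=True)

# Macro-averaged scorer names work with more than two classes.
multiclass_scoring = ["precision_macro", "recall_macro", "f1_macro"]

iris_search = GridSearchCV(
    SVC(),
    param_grid={"kernel": ["linear"], "C": [1, 10]},  # illustrative grid
    scoring=multiclass_scoring,
    # With several metrics, refit must name one of them (or be False).
    refit="f1_macro",
)
iris_search.fit(X_iris, y_iris)

# Each metric gets its own mean_test_<name> entry, one value per combination.
for name in multiclass_scoring:
    print(name, iris_search.cv_results_["mean_test_%s" % name])
```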

['AUC', 'F1', 'Precision', 'Recall']

Note: the point is not to draw inferences from the model, only to show how multi-metric evaluation works. The data is just random data.

from sklearn import datasets
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = datasets.make_classification(n_classes=2, random_state=0)

# The scorers can be either one of the predefined metric strings or a scorer
# callable, like the one returned by make_scorer
f1_scorer = make_scorer(f1_score, average='binary')
scoring = {'AUC': 'roc_auc', 'F1': f1_scorer, 'Precision': 'precision', 'Recall': 'recall'}

# split data to train and test data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

clf = GridSearchCV(
    SVC(),
    param_grid={'kernel': ['linear'], 'C': [1, 10, 100, 1000]},
    scoring=scoring,
    refit='AUC',  # the final estimator is refit using the best AUC
    return_train_score=True  # train scores are needed for the plot
)
clf.fit(X_train, y_train)
results = clf.cv_results_
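After a multi-metric fit, cv_results_ holds one mean_test_<name> and one std_test_<name> array per scorer, aligned with cv_results_['params']. A small self-contained sketch of printing every metric for every parameter combination follows; the data, grid, and the names demo_scoring, demo_search, and demo_results are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X_demo, y_demo = make_classification(n_classes=2, random_state=0)

demo_scoring = {"Precision": "precision", "Recall": "recall", "F1": "f1"}
demo_search = GridSearchCV(
    SVC(),
    param_grid={"kernel": ["linear"], "C": [1, 10]},  # illustrative grid
    scoring=demo_scoring,
    refit="F1",
)
demo_search.fit(X_demo, y_demo)
demo_results = demo_search.cv_results_

# One printed line per parameter combination, listing every metric.
for i, params in enumerate(demo_results["params"]):
    summary = ", ".join(
        "%s=%.3f (+/-%.3f)" % (
            name,
            demo_results["mean_test_%s" % name][i],
            demo_results["std_test_%s" % name][i] * 2,
        )
        for name in demo_scoring
    )
    print(params, summary)
```

This needs only a single call to .fit, which is what the question asks for.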


**Plotting the result**

import matplotlib.pyplot as plt
import numpy as np

plt.figure(figsize=(10, 10))
plt.title("GridSearchCV evaluating using multiple scorers simultaneously",
      fontsize=16)

plt.xlabel("C")
plt.ylabel("Score")

ax = plt.gca()
ax.set_xlim(1, 1000)
ax.set_ylim(0.40, 1)

# Get the regular numpy array from the MaskedArray
X_axis = np.array(results['param_C'].data, dtype=float)

for scorer, color in zip(sorted(scoring), ['g', 'k', 'b', 'r']):
    for sample, style in (('train', '--'), ('test', '-')):
        sample_score_mean = results['mean_%s_%s' % (sample, scorer)]
        sample_score_std = results['std_%s_%s' % (sample, scorer)]
        ax.fill_between(X_axis, sample_score_mean - sample_score_std,
                        sample_score_mean + sample_score_std,
                        alpha=0.1 if sample == 'test' else 0, color=color)
        ax.plot(X_axis, sample_score_mean, style, color=color,
                alpha=1 if sample == 'test' else 0.7,
                label="%s (%s)" % (scorer, sample))

    best_index = np.nonzero(results['rank_test_%s' % scorer] == 1)[0][0]
    best_score = results['mean_test_%s' % scorer][best_index]

    # Plot a dotted vertical line at the best score for that scorer marked by x
    ax.plot([X_axis[best_index], ] * 2, [0, best_score],
        linestyle='-.', color=color, marker='x', markeredgewidth=3, ms=8)

    # Annotate the best score for that scorer
    ax.annotate("%0.2f" % best_score,
            (X_axis[best_index], best_score + 0.005))

plt.legend(loc="best")
plt.grid(False)
plt.show()

[Output plot: mean train and test scores for each scorer plotted against C]

