How to Plot a PR Curve Over 10 Folds of Cross-Validation in Scikit-Learn

Question

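I'm running some supervised experiments for a binary prediction problem. I'm using 10-fold cross-validation to evaluate performance in terms of mean average precision (the average precision of each fold, summed and divided by the number of folds, which is 10 in my case). I would like to plot a PR curve of the mean average precision over these 10 folds, but I'm not sure of the best way to do this.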

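A previous question on the Cross Validated Stack Exchange site raised this same problem. A comment recommended working through this example from the Scikit-Learn site on plotting ROC curves across cross-validation folds and adapting it to average precision. Here is the relevant section of code I modified to try this idea: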

from numpy import interp  # scipy.interp was an alias of numpy.interp and is gone from current SciPy
# Other packages/functions are imported, but not crucial to the question
max_ent = LogisticRegression()

mean_precision = 0.0
mean_recall = np.linspace(0,1,100)
mean_average_precision = []

for i in set(folds):
    y_scores = max_ent.fit(X_train, y_train).decision_function(X_test)
    precision, recall, _ = precision_recall_curve(y_test, y_scores)
    average_precision = average_precision_score(y_test, y_scores)
    mean_average_precision.append(average_precision)
    mean_precision += interp(mean_recall, recall, precision)

# After this line of code, inspecting the mean_precision array shows that 
# the majority of the elements equal 1. This is the part that is confusing me
# and is contributing to the incorrect plot.
mean_precision /= len(set(folds))
# This is what the actual MAP score should be
mean_average_precision = sum(mean_average_precision) / len(mean_average_precision)

# Code for plotting the mean average precision curve across folds
plt.plot(mean_recall, mean_precision)
plt.title('Mean AP Over 10 folds (area=%0.2f)' % (mean_average_precision))
plt.show()

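The code runs, but in my case the mean average precision curve is incorrect. For some reason, the array I use to store the mean_precision scores (the mean_tpr variable in the ROC example) ends up with a first element near zero and all other elements equal to 1 after dividing by the number of folds. When mean_precision is plotted against mean_recall, the curve jumps to 1, which is inaccurate. So my hunch is that something goes wrong in the update of mean_precision (mean_precision += interp(mean_recall, recall, precision)) at each fold of the cross-validation, but it's unclear how to fix this. Any guidance or help would be appreciated.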

Answer 1

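I couldn't find an answer in other discussions, so I hope this helps. The main thing is to reverse recall and precision before using interp, because precision_recall_curve returns recall in decreasing order while interp expects increasing x-coordinates: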

# Reverse both arrays so that recall runs from 0 up to 1
reversed_recall = recall[::-1]
reversed_precision = precision[::-1]
reversed_mean_precision += interp(mean_recall, reversed_recall, reversed_precision)
reversed_mean_precision[0] = 0.0
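
To see why the reversal matters, here is a minimal sketch on made-up labels and scores (the toy arrays are assumptions, not data from the question). It only checks the ordering of the arrays that precision_recall_curve returns and compares the two ways of calling np.interp; the NumPy documentation states that the x-coordinates must be increasing and that the results are meaningless otherwise.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Toy labels and scores, invented purely for illustration
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.5, 0.7, 0.2, 0.6])

precision, recall, _ = precision_recall_curve(y_true, y_score)
mean_recall = np.linspace(0, 1, 100)

# recall runs from 1 down to 0, so passing it directly violates
# np.interp's requirement that the x-coordinates be increasing
wrong = np.interp(mean_recall, recall, precision)

# Reversing both arrays puts recall in increasing order
right = np.interp(mean_recall, recall[::-1], precision[::-1])

print("recall is non-increasing:", bool(np.all(np.diff(recall) <= 0)))
print("unreversed:", np.round(wrong[:5], 2))
print("reversed:  ", np.round(right[:5], 2))
```

Comparing the two printed arrays should make the symptom from the question visible: only the reversed call follows the actual curve.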

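After the loop, divide by the number of folds. The interpolated values are already aligned with mean_recall, so plot them as they are, without flipping them again:

reversed_mean_precision /= FOLDS
reversed_mean_precision[0] = 1
mean_auc_pr = auc(mean_recall, reversed_mean_precision)
plt.plot(mean_recall, reversed_mean_precision, 'k--',
         label='Mean precision (area = %0.2f)' % mean_auc_pr, lw=2)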

The full code is here:

import matplotlib.pyplot as plt
import numpy as np
from numpy import interp
from sklearn import metrics, svm
from sklearn.model_selection import KFold

# pred_features (a pandas DataFrame) and pred_true (a pandas Series) are my own data
FOLDS = 10
AUCs = []

k_fold = KFold(n_splits=FOLDS, shuffle=True, random_state=None)
clf = svm.SVC(kernel='linear', C=1.0)
reversed_mean_precision = 0.0
mean_recall = np.linspace(0, 1, 100)

for train_index, test_index in k_fold.split(pred_features):
    xtrain, xtest = pred_features.iloc[train_index], pred_features.iloc[test_index]
    ytrain, ytest = pred_true.iloc[train_index], pred_true.iloc[test_index]
    # predict() returns hard labels, so each fold's curve has only a few points;
    # continuous scores (e.g. from decision_function) would give a finer curve
    test_prob = clf.fit(xtrain, ytrain).predict(xtest)
    precision, recall, thresholds = metrics.precision_recall_curve(ytest, test_prob, pos_label=2)
    # recall comes back in decreasing order; interp needs increasing x values
    reversed_recall = recall[::-1]
    reversed_precision = precision[::-1]
    reversed_mean_precision += interp(mean_recall, reversed_recall, reversed_precision)
    reversed_mean_precision[0] = 0.0

    AUCs.append(metrics.auc(recall, precision))

plt.plot([0, 1], [0, 1], '--', color=(0.6, 0.6, 0.6), label='Luck')

reversed_mean_precision /= FOLDS
reversed_mean_precision[0] = 1
mean_auc_pr = metrics.auc(mean_recall, reversed_mean_precision)
# The averaged values are already aligned with mean_recall, so no flip is needed here
plt.plot(mean_recall, reversed_mean_precision, 'k--',
         label='Mean precision (area = %0.2f)' % mean_auc_pr, lw=2)

plt.xlim([0, 1])
plt.ylim([0, 1])
plt.xlabel('Recall')
plt.ylabel('Precision')
plt.title('Precision Recall')
plt.legend(loc="lower right")
plt.show()
print("AUCs: ")
print(sum(AUCs) / float(len(AUCs)))
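
Since the code above depends on data that is not part of this thread, here is a compact sketch of the same fold-averaging idea that runs on its own. The synthetic dataset from make_classification, the LogisticRegression model, and the names fold_precisions and fold_aps are stand-ins chosen for illustration, not what the answer used.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, precision_recall_curve
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for pred_features / pred_true (an assumption)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
clf = LogisticRegression(max_iter=1000)

mean_recall = np.linspace(0, 1, 100)
fold_precisions = []
fold_aps = []

for train_index, test_index in cv.split(X, y):
    clf.fit(X[train_index], y[train_index])
    scores = clf.decision_function(X[test_index])
    precision, recall, _ = precision_recall_curve(y[test_index], scores)
    # Reverse so recall is increasing, as np.interp requires
    fold_precisions.append(np.interp(mean_recall, recall[::-1], precision[::-1]))
    fold_aps.append(average_precision_score(y[test_index], scores))

# Point-wise average of the interpolated curves across folds
mean_precision = np.mean(fold_precisions, axis=0)
mean_precision[0] = 1.0  # same convention as the answer: precision 1 at recall 0

print("mean AP over folds: %.3f" % np.mean(fold_aps))
```

Calling plt.plot(mean_recall, mean_precision) afterwards would draw the averaged curve.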

Answer 2

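I had the same problem. Here is my solution: instead of averaging across the folds, I compute precision_recall_curve once, after the loop, on the pooled results of all folds. According to the discussion at https://stats.stackexchange.com/questions/34611/meanscores-vs-scoreconcatenation-in-cross-validation, this is generally the preferable approach.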

import matplotlib.pyplot as plt
import numpy
from sklearn.datasets import make_blobs
from sklearn.metrics import precision_recall_curve, auc
from sklearn.model_selection import KFold
from sklearn.svm import SVC

FOLDS = 5

X, y = make_blobs(n_samples=1000, n_features=2, centers=2, cluster_std=10.0,
    random_state=12345)

f, axes = plt.subplots(1, 2, figsize=(10, 5))

axes[0].scatter(X[y==0,0], X[y==0,1], color='blue', s=2, label='y=0')
axes[0].scatter(X[y!=0,0], X[y!=0,1], color='red', s=2, label='y=1')
axes[0].set_xlabel('X[:,0]')
axes[0].set_ylabel('X[:,1]')
axes[0].legend(loc='lower left', fontsize='small')

k_fold = KFold(n_splits=FOLDS, shuffle=True, random_state=12345)
predictor = SVC(kernel='linear', C=1.0, probability=True, random_state=12345)

y_real = []
y_proba = []
for i, (train_index, test_index) in enumerate(k_fold.split(X)):
    Xtrain, Xtest = X[train_index], X[test_index]
    ytrain, ytest = y[train_index], y[test_index]
    predictor.fit(Xtrain, ytrain)
    pred_proba = predictor.predict_proba(Xtest)
    precision, recall, _ = precision_recall_curve(ytest, pred_proba[:,1])
    lab = 'Fold %d AUC=%.4f' % (i+1, auc(recall, precision))
    axes[1].step(recall, precision, label=lab)
    y_real.append(ytest)
    y_proba.append(pred_proba[:,1])

y_real = numpy.concatenate(y_real)
y_proba = numpy.concatenate(y_proba)
precision, recall, _ = precision_recall_curve(y_real, y_proba)
lab = 'Overall AUC=%.4f' % (auc(recall, precision))
axes[1].step(recall, precision, label=lab, lw=2, color='black')
axes[1].set_xlabel('Recall')
axes[1].set_ylabel('Precision')
axes[1].legend(loc='lower left', fontsize='small')

f.tight_layout()
f.savefig('result.png')
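
The pooling step can also be sketched with cross_val_predict from sklearn.model_selection, which returns one out-of-fold score per sample. This is an alternative way of writing the idea, not the code the answer used; LogisticRegression is swapped in only to keep the sketch fast, and the comparison against a manual loop is there to check that both routes give the same pooled scores.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, precision_recall_curve
from sklearn.model_selection import KFold, cross_val_predict

X, y = make_blobs(n_samples=1000, n_features=2, centers=2, cluster_std=10.0,
                  random_state=12345)

k_fold = KFold(n_splits=5, shuffle=True, random_state=12345)
predictor = LogisticRegression()  # stand-in model (an assumption)

# Each sample is scored by the model that did not see it during training
y_score = cross_val_predict(predictor, X, y, cv=k_fold, method='decision_function')

# Manual pooling over the same folds, for comparison
pooled_true, pooled_score = [], []
for train_index, test_index in k_fold.split(X):
    predictor.fit(X[train_index], y[train_index])
    pooled_true.append(y[test_index])
    pooled_score.append(predictor.decision_function(X[test_index]))
pooled_true = np.concatenate(pooled_true)
pooled_score = np.concatenate(pooled_score)

# One PR curve over all out-of-fold scores
precision, recall, _ = precision_recall_curve(y, y_score)
ap_cvp = average_precision_score(y, y_score)
ap_manual = average_precision_score(pooled_true, pooled_score)
print("AP via cross_val_predict: %.4f, via manual pooling: %.4f" % (ap_cvp, ap_manual))
```

The manual loop is still needed if the per-fold curves are to be drawn as well, since cross_val_predict only returns the pooled scores.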

Answer 3

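Adding to @Dietmar's answer: I agree that it is mostly correct, except that instead of using sklearn.metrics.auc to compute the area under the precision-recall curve, I think we should use sklearn.metrics.average_precision_score.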

Supporting literature:

Davis, J., & Goadrich, M. (2006, June). The relationship between Precision-Recall and ROC curves. In Proceedings of the 23rd International Conference on Machine Learning (pp. 233-240).

For example, in PR space it is incorrect to linearly interpolate between points
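
A small worked example of that point, as I read the paper's argument: between two operating points, true and false positive counts change together, so precision does not move along a straight line in PR space. The counts below are hypothetical and chosen only to make the arithmetic easy to follow, and precision_between is an illustrative helper, not a library function.

```python
import numpy as np

# Hypothetical counts for two adjacent operating points A and B,
# with 20 positive examples in total (illustrative numbers only)
total_pos = 20
tp_a, fp_a = 5, 5     # point A: recall 0.25, precision 0.50
tp_b, fp_b = 10, 30   # point B: recall 0.50, precision 0.25

def precision_between(tp):
    """Precision at an intermediate true-positive count, assuming false
    positives grow linearly with true positives between A and B."""
    fp = fp_a + (fp_b - fp_a) / (tp_b - tp_a) * (tp - tp_a)
    return tp / (tp + fp)

tp_mid = 7.5                      # halfway between A and B in recall
recall_mid = tp_mid / total_pos   # 0.375

# Straight line between A and B in PR space
precision_a = tp_a / (tp_a + fp_a)
precision_b = tp_b / (tp_b + fp_b)
linear_mid = np.interp(recall_mid,
                       [tp_a / total_pos, tp_b / total_pos],
                       [precision_a, precision_b])

# Interpolating the counts instead gives a lower precision
count_mid = precision_between(tp_mid)

print("linear: %.3f, count-based: %.3f" % (linear_mid, count_mid))
# prints "linear: 0.375, count-based: 0.300"
```

In this example the straight line overstates precision at the midpoint, which is the kind of optimism the quote warns about.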

Boyd, K., Eng, K. H., & Page, C. D. (2013, September). Area under the precision-recall curve: point estimates and confidence intervals. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases (pp. 451-466). Springer, Berlin, Heidelberg.

We provide evidence in favor of computing AUCPR using the lower trapezoid, average precision, or interpolated median estimators

From sklearn's documentation on average_precision_score:

This implementation is not interpolated and is different from computing the area under the precision-recall curve with the trapezoidal rule, which uses linear interpolation and can be too optimistic.
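
To make the difference concrete, here is a minimal sketch on made-up labels and scores (the toy arrays are assumptions). It reproduces average precision as a step-wise sum over the output of precision_recall_curve and computes the trapezoidal area with auc next to it; on real data the two numbers generally differ, and the direction of the gap depends on the shape of the curve.

```python
import numpy as np
from sklearn.metrics import auc, average_precision_score, precision_recall_curve

# Toy labels and scores, invented for illustration
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.5, 0.7, 0.2, 0.6])

precision, recall, _ = precision_recall_curve(y_true, y_score)

# Step-wise sum: each change in recall is weighted by the precision at that
# threshold (recall is returned in decreasing order, hence the minus sign)
ap_manual = -np.sum(np.diff(recall) * precision[:-1])

ap_sklearn = average_precision_score(y_true, y_score)
trapezoid = auc(recall, precision)  # linear interpolation between points

print("average precision (manual):  %.4f" % ap_manual)
print("average precision (sklearn): %.4f" % ap_sklearn)
print("trapezoidal area:            %.4f" % trapezoid)
```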

Here is a fully reproducible example, which I hope helps anyone else who comes across this thread:

import matplotlib.pyplot as plt
import numpy as np
from numpy import interp
import pandas as pd
from sklearn.datasets import make_blobs
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, auc, average_precision_score, confusion_matrix, roc_curve, precision_recall_curve
from sklearn.model_selection import KFold, train_test_split, RandomizedSearchCV, StratifiedKFold
from sklearn.svm import SVC

%matplotlib inline

def draw_cv_roc_curve(classifier, cv, X, y, title='ROC Curve'):
    """
    Draw a Cross Validated ROC Curve.
    Args:
        classifier: Classifier Object
        cv: StratifiedKFold Object: (https://stats.stackexchange.com/questions/49540/understanding-stratified-cross-validation)
        X: Feature Pandas DataFrame
        y: Response Pandas Series
    Example largely taken from http://scikit-learn.org/stable/auto_examples/model_selection/plot_roc_crossval.html#sphx-glr-auto-examples-model-selection-plot-roc-crossval-py
    """
    # Creating ROC Curve with Cross Validation
    tprs = []
    aucs = []
    mean_fpr = np.linspace(0, 1, 100)

    i = 0
    for train, test in cv.split(X, y):
        probas_ = classifier.fit(X.iloc[train], y.iloc[train]).predict_proba(X.iloc[test])
        # Compute ROC curve and area under the curve
        fpr, tpr, thresholds = roc_curve(y.iloc[test], probas_[:, 1])
        tprs.append(interp(mean_fpr, fpr, tpr))
        
        tprs[-1][0] = 0.0
        roc_auc = auc(fpr, tpr)
        aucs.append(roc_auc)
        plt.plot(fpr, tpr, lw=1, alpha=0.3,
                 label='ROC fold %d (AUC = %0.2f)' % (i, roc_auc))

        i += 1
    plt.plot([0, 1], [0, 1], linestyle='--', lw=2, color='r',
             label='Luck', alpha=.8)
    
    mean_tpr = np.mean(tprs, axis=0)
    mean_tpr[-1] = 1.0
    mean_auc = auc(mean_fpr, mean_tpr)
    std_auc = np.std(aucs)
    plt.plot(mean_fpr, mean_tpr, color='b',
             label=r'Mean ROC (AUC = %0.2f $\pm$ %0.2f)' % (mean_auc, std_auc),
             lw=2, alpha=.8)

    std_tpr = np.std(tprs, axis=0)
    tprs_upper = np.minimum(mean_tpr + std_tpr, 1)
    tprs_lower = np.maximum(mean_tpr - std_tpr, 0)
    plt.fill_between(mean_fpr, tprs_lower, tprs_upper, color='grey', alpha=.2,
                     label=r'$\pm$ 1 std. dev.')

    plt.xlim([-0.05, 1.05])
    plt.ylim([-0.05, 1.05])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title(title)
    plt.legend(loc="lower right")
    plt.show()


def draw_cv_pr_curve(classifier, cv, X, y, title='PR Curve'):
    """
    Draw a Cross Validated PR Curve.
    Keyword Args:
        classifier: Classifier Object
        cv: StratifiedKFold Object: (https://stats.stackexchange.com/questions/49540/understanding-stratified-cross-validation)
        X: Feature Pandas DataFrame
        y: Response Pandas Series
        
    Largely taken from: https://stackoverflow.com/questions/29656550/how-to-plot-pr-curve-over-10-folds-of-cross-validation-in-scikit-learn
    """
    y_real = []
    y_proba = []

    i = 0
    for train, test in cv.split(X, y):
        probas_ = classifier.fit(X.iloc[train], y.iloc[train]).predict_proba(X.iloc[test])
        # Compute the PR curve for this fold
        precision, recall, _ = precision_recall_curve(y.iloc[test], probas_[:, 1])
        
        # Plotting each individual PR Curve; the legend reports average precision (AP)
        plt.plot(recall, precision, lw=1, alpha=0.3,
                 label='PR fold %d (AP = %0.2f)' % (i, average_precision_score(y.iloc[test], probas_[:, 1])))
        
        y_real.append(y.iloc[test])
        y_proba.append(probas_[:, 1])

        i += 1
    
    y_real = np.concatenate(y_real)
    y_proba = np.concatenate(y_proba)
    
    precision, recall, _ = precision_recall_curve(y_real, y_proba)

    plt.plot(recall, precision, color='b',
             label=r'Precision-Recall (AP = %0.2f)' % (average_precision_score(y_real, y_proba)),
             lw=2, alpha=.8)

    plt.xlim([-0.05, 1.05])
    plt.ylim([-0.05, 1.05])
    plt.xlabel('Recall')
    plt.ylabel('Precision')
    plt.title(title)
    plt.legend(loc="lower right")
    plt.show()

# Create a fake example where X is a 1000 x 2 matrix
# and y is a vector of 1000 labels
# Binary Classification Problem
FOLDS = 5

X, y = make_blobs(n_samples=1000, n_features=2, centers=2, cluster_std=10.0,
    random_state=12345)

X = pd.DataFrame(X)
y = pd.Series(y)  # 1-D response, as the docstrings above expect

f, axes = plt.subplots(1, 2, figsize=(10, 5))

axes[0].scatter(X.loc[y == 0, 0], X.loc[y == 0, 1], color='blue', s=2, label='y=0')
axes[0].scatter(X.loc[y != 0, 0], X.loc[y != 0, 1], color='red', s=2, label='y=1')
axes[0].set_xlabel('X[:,0]')
axes[0].set_ylabel('X[:,1]')
axes[0].legend(loc='lower left', fontsize='small')

# Setting up simple RF Classifier
clf = RandomForestClassifier()

# Set up Stratified K Fold
cv = StratifiedKFold(n_splits=FOLDS)  # use the FOLDS constant defined above

draw_cv_roc_curve(clf, cv, X, y, title='Cross Validated ROC')

draw_cv_pr_curve(clf, cv, X, y, title='Cross Validated PR Curve')
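
As a complement to the plots, the per-fold average precision that the original question averages over can be sketched with cross_val_score, using 'average_precision', one of scikit-learn's built-in scorer names. The n_estimators and random_state values below are arbitrary choices made to keep the sketch quick and repeatable.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = make_blobs(n_samples=1000, n_features=2, centers=2, cluster_std=10.0,
                  random_state=12345)

# Arbitrary settings, chosen only for speed and repeatability
clf = RandomForestClassifier(n_estimators=50, random_state=0)
cv = StratifiedKFold(n_splits=5)

# One average precision value per fold
fold_ap = cross_val_score(clf, X, y, cv=cv, scoring='average_precision')

print("AP per fold:", np.round(fold_ap, 3))
print("mean AP: %.3f +/- %.3f" % (fold_ap.mean(), fold_ap.std()))
```

Note that the mean of the per-fold values is not the same quantity as the average precision of the pooled curve drawn by draw_cv_pr_curve, so the two numbers need not agree.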
