Cross-validation and standardization in scikit-learn

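I want to measure the accuracy of a sklearn classifier using k-fold cross-validation. I can already estimate the accuracy without cross-validation. How can I change this code so that it runs cross-validation and applies StandardScaler at the same time?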

from sklearn.datasets import load_iris
# sklearn.cross_validation was removed; these helpers now live in sklearn.model_selection
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn import metrics
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn import svm
from sklearn.pipeline import Pipeline
iris = load_iris()
X = iris.data
y = iris.target
# Single hold-out split (no cross-validation yet)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=4)
# The pipeline scales the features, then fits a linear SVM
pipe_lrSVC = Pipeline([('scaler', StandardScaler()), ('clf', svm.LinearSVC())])
pipe_lrSVC.fit(X_train, y_train)
y_pred = pipe_lrSVC.predict(X_test)
print(metrics.accuracy_score(y_test, y_pred))

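Just pass the pipeline as the estimator to cross_val_score. The whole pipeline is refit on each training fold, so the StandardScaler never sees the held-out fold: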

cross_val_score(pipe_lrSVC, iris.data, iris.target, cv=5)
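
For completeness, here is a minimal self-contained sketch of the same idea that also averages the per-fold scores. The names pipe, scores and mean_accuracy are illustrative, and the random_state, max_iter and scoring="accuracy" settings are my own assumptions added for reproducibility, not part of the original question.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

iris = load_iris()

# Scaler and classifier are bundled so both are refit on every training fold.
# random_state and max_iter are assumed settings, chosen only for reproducibility.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", LinearSVC(random_state=0, max_iter=10000)),
])

# One accuracy value per fold (5 folds here)
scores = cross_val_score(pipe, iris.data, iris.target, cv=5, scoring="accuracy")
mean_accuracy = scores.mean()

print("per-fold accuracy:", scores)
print("mean accuracy:", mean_accuracy)
```

Reporting the mean (and possibly the standard deviation) of scores gives a single summary number comparable to the hold-out accuracy printed in the question.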