Scaler fitted in a pipeline turns out to be not fitted yet

Consider this code:

import pandas as pd
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import RFE
from sklearn.pipeline import Pipeline

# data
train_X = pd.DataFrame(data=np.random.rand(20, 3), columns=["a", "b", "c"])
train_y = pd.Series(data=np.random.randint(0,2, 20), name="y")
test_X = pd.DataFrame(data=np.random.rand(10, 3), columns=["a", "b", "c"])
test_y = pd.Series(data=np.random.randint(0,2, 10), name="y")

# scaler
scaler = StandardScaler()

# feature selection
p = Pipeline(steps=[("scaler0", scaler),
                    ("model", SVC(kernel="linear", C=1))])

rfe = RFE(p, n_features_to_select=2, step=1,
          importance_getter="named_steps.model.coef_")
rfe.fit(train_X, train_y)

# apply the scaler to the test set
scaled_test = scaler.transform(test_X)

I get this message:

NotFittedError: This StandardScaler instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.

Why is the scaler not fitted?

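When you pass a pipeline or an estimator to RFE, RFE clones it and fits the clones, repeating this until only the requested number of features remains. The scaler object you created is never fitted itself, which is why calling scaler.transform raises NotFittedError.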
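A minimal sketch of the cloning behaviour, calling sklearn.base.clone directly (as far as I know this is what RFE uses internally). The variable names here are illustrative and not part of the question's code:

```python
import numpy as np
from sklearn.base import clone
from sklearn.exceptions import NotFittedError
from sklearn.preprocessing import StandardScaler
from sklearn.utils.validation import check_is_fitted

X = np.random.rand(20, 3)

scaler = StandardScaler()
scaler_copy = clone(scaler)   # new, unfitted object with the same parameters
scaler_copy.fit(X)            # only the copy learns mean_ and scale_

check_is_fitted(scaler_copy)  # passes silently: the copy is fitted

# The original object was never touched by fit()
try:
    check_is_fitted(scaler)
    original_is_fitted = True
except NotFittedError:
    original_is_fitted = False

print("original fitted:", original_is_fitted)
```

Fitting the clone leaves the original untouched, which matches the error in the question.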

To access this fitted estimator, you can use fit_pipeline = rfe.estimator_

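Note, however, that this new pipeline was fitted on only the top n_features_to_select features, so the test data has to be reduced to those same features (for example with rfe.transform) before the fitted scaler is applied to it.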
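Putting this together, here is one way it could look. This is a sketch rather than the only approach: the seeded random data and the alternating labels are my own stand-ins for the question's data, chosen so that both classes are always present.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import RFE
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Illustrative data (assumption: same shapes and column names as the question)
rng = np.random.default_rng(0)
train_X = pd.DataFrame(rng.random((20, 3)), columns=["a", "b", "c"])
train_y = pd.Series(np.tile([0, 1], 10), name="y")
test_X = pd.DataFrame(rng.random((10, 3)), columns=["a", "b", "c"])

p = Pipeline(steps=[("scaler0", StandardScaler()),
                    ("model", SVC(kernel="linear", C=1))])
rfe = RFE(p, n_features_to_select=2, step=1,
          importance_getter="named_steps.model.coef_")
rfe.fit(train_X, train_y)

# The fitted clone of the pipeline, and the fitted scaler inside it
fit_pipeline = rfe.estimator_
fitted_scaler = fit_pipeline.named_steps["scaler0"]

# Reduce the test set to the selected features first, then scale it
selected_test = rfe.transform(test_X)
scaled_test = fitted_scaler.transform(selected_test)

# Names of the columns RFE kept
selected_columns = train_X.columns[rfe.support_]
print(list(selected_columns), scaled_test.shape)
```

If you only need predictions, rfe.predict(test_X) should do the feature selection and then run the fitted pipeline (scaler included) in one call.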