Setting the parameters of an imputer within a three-level pipeline

I'm new to data science, and I'm using pipelines to keep my code organized.

This is the snippet I'm trying to tidy up:

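### Imports ###
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from xgboost import XGBRegressor

### Preprocessing ###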
# Preprocessing for numerical data
numerical_transformer = Pipeline(steps=[
                ('imputer', SimpleImputer()),
                ('scaler', StandardScaler())
])

# Preprocessing for categorical data
categorical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='most_frequent')),
    # scikit-learn >= 1.2 names this argument sparse_output; older releases call it sparse
    ('onehot', OneHotEncoder(handle_unknown='ignore', sparse_output=False))
])

# Bundle preprocessing for numerical and categorical data
preprocessor = ColumnTransformer(
    transformers=[
        ('num', numerical_transformer, numerical_cols),
        ('cat', categorical_transformer, categorical_cols)
    ])

### Model ###
model = XGBRegressor(objective='reg:squarederror', n_estimators=1000, learning_rate=0.05)

### Processing ###
# Bundle preprocessing and modeling code in a pipeline
my_pipeline = Pipeline(steps=[('preprocessor', preprocessor),
                              ('model', model)
                             ])

parameters = {}
# => How to set the parameters for one of the parts of the numerical_transformer pipeline?

# GridSearch
CV = GridSearchCV(my_pipeline, parameters, scoring='neg_mean_absolute_error', n_jobs=1)

CV.fit(X_train, y_train)

How can I change the parameters of the imputer inside the numerical_transformer pipeline?

Thanks,

After @desernaut pointed me in the right direction, here is the answer:

parameters['preprocessor__num__imputer__strategy'] = ['most_frequent', 'mean', 'median']
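
For anyone landing here later: the key is built by joining the step names from the outside in with double underscores, so 'preprocessor' (the outer pipeline step), 'num' (the ColumnTransformer entry), 'imputer' (the inner pipeline step) and finally the parameter name 'strategy'. Below is a minimal sketch of how to list the valid names with get_params() and run the search. The toy data, the integer column indices, cv=3, and Ridge as a stand-in for XGBRegressor are my own assumptions to keep it self-contained; they are not part of the original setup.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy numeric data with some missing values (made up for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=60)
X[::7, 1] = np.nan

numerical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer()),
    ('scaler', StandardScaler()),
])

# Only the numerical branch is reproduced; columns are selected by index.
preprocessor = ColumnTransformer(transformers=[
    ('num', numerical_transformer, [0, 1, 2]),
])

# Ridge is a stand-in (assumption) so the sketch does not need xgboost.
my_pipeline = Pipeline(steps=[
    ('preprocessor', preprocessor),
    ('model', Ridge()),
])

# List every parameter name that reaches the nested imputer.
imputer_params = sorted(
    name for name in my_pipeline.get_params()
    if name.startswith('preprocessor__num__imputer__')
)
print(imputer_params)

# Use one of those names as a key in the search grid.
parameters = {
    'preprocessor__num__imputer__strategy': ['most_frequent', 'mean', 'median'],
}
CV = GridSearchCV(my_pipeline, parameters,
                  scoring='neg_mean_absolute_error', cv=3, n_jobs=1)
CV.fit(X, y)
print(CV.best_params_)
```

The same names work with my_pipeline.set_params(...) if you want to change a value directly instead of searching over it.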

Thanks, @desernaut!