Using fit_params from an sklearn Pipeline for training

I'm using XGBClassifier from the xgboost library inside an sklearn Pipeline, but whenever I try to pass one of the **fit_params the way the documentation describes (https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html#sklearn.pipeline.Pipeline.fit), I get a KeyError.

xgb_model = XGBClassifier(eval_metric='logloss', use_label_encoder=False)
pipeline = Pipeline([("preproc", preprocesser), ("classifier", xgb_model)])
pipeline.fit(
    X_train, y_train, train_model__eval_set=[(X_valid_transformed, y_valid)]
)

I get:

KeyError: 'train_model'

From the sklearn.pipeline docs:

...
**fit_params : dict of string -> object
  Parameters passed to the fit method of each step,
  where each parameter name is prefixed such that
  parameter p for step s has key s__p
...
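
To make the s__p rule concrete, here is a simplified sketch of that kind of routing. It is an illustration only, not scikit-learn's actual source, and route_fit_params is a hypothetical helper name. It shows why a prefix that matches no step name ends in a KeyError.

```python
def route_fit_params(step_names, fit_params):
    """Group 'step__param' keyword arguments by step name (simplified sketch)."""
    routed = {name: {} for name in step_names}
    for key, value in fit_params.items():
        # Split only on the first "__": the left part is the step name.
        step, param = key.split("__", 1)
        # An unknown step name fails here with KeyError, as in the question.
        routed[step][param] = value
    return routed


steps = ["preproc", "classifier"]

# Prefix matches the step name "classifier": the parameter is routed.
ok = route_fit_params(steps, {"classifier__eval_set": "validation data"})

# Prefix "train_model" matches no step name: the lookup fails.
try:
    route_fit_params(steps, {"train_model__eval_set": "validation data"})
except KeyError as err:
    missing_step = err.args[0]
```

In this sketch the failing lookup key is the prefix itself, which matches the 'train_model' shown in the error message above.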

So, for your code, the prefix has to be the name of the step, classifier, rather than train_model:

                                                      |
                                                      |
                                                      v
                                                   ________
                                                  |        |
pipeline = Pipeline([("preproc", preprocesser), ("classifier", xgb_model)])
pipeline.fit(
    X_train, y_train, classifier__eval_set=[(X_valid_transformed, y_valid)]
)                     |________|
                          ^
                          |
                          |
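
The same prefix rule can be tried without xgboost. The sketch below is a stand-in, not the asker's setup: it assumes StandardScaler as the preprocessing step, LogisticRegression as the classifier, sample_weight as the fit parameter, and a tiny made-up dataset.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Tiny made-up dataset, for illustration only.
X = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 1.5],
              [3.0, 0.5], [4.0, 2.0], [5.0, 1.0]])
y = np.array([0, 0, 0, 1, 1, 1])
weights = np.array([1.0, 1.0, 2.0, 2.0, 1.0, 1.0])

pipeline = Pipeline([
    ("preproc", StandardScaler()),
    ("classifier", LogisticRegression()),
])

# The part before "__" is the step name "classifier", so sample_weight
# is forwarded to LogisticRegression.fit.
pipeline.fit(X, y, classifier__sample_weight=weights)
predictions = pipeline.predict(X)
```

Fit parameters are forwarded to the step unchanged, so the pipeline does not run them through the earlier steps. That is presumably why the question passes an already transformed X_valid_transformed in eval_set.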