Setting the parameters of an imputer within a three-level pipeline
I am new to data science, and I am using pipelines to organize my code.
The snippet I am trying to organize is as follows:
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from xgboost import XGBRegressor

# numerical_cols, categorical_cols, X_train and y_train are defined elsewhere.

### Preprocessing ###
# Preprocessing for numerical data
numerical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer()),
    ('scaler', StandardScaler())
])
# Preprocessing for categorical data
# Note: scikit-learn 1.2+ renames `sparse` to `sparse_output`.
categorical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='most_frequent')),
    ('onehot', OneHotEncoder(handle_unknown='ignore', sparse=False))
])
# Bundle preprocessing for numerical and categorical data
preprocessor = ColumnTransformer(
    transformers=[
        ('num', numerical_transformer, numerical_cols),
        ('cat', categorical_transformer, categorical_cols)
    ])
### Model ###
model = XGBRegressor(objective='reg:squarederror', n_estimators=1000, learning_rate=0.05)
### Processing ###
# Bundle preprocessing and modeling code in a pipeline
my_pipeline = Pipeline(steps=[('preprocessor', preprocessor),
                              ('model', model)
                              ])
parameters = {}
# => How to set the parameters for one of the parts of the numerical_transformer pipeline?
# GridSearch
CV = GridSearchCV(my_pipeline, parameters, scoring='neg_mean_absolute_error', n_jobs=1)
CV.fit(X_train, y_train)
How can I change the parameters of the imputer inside the numerical_transformer pipeline?
Thanks,
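After @desernaut pointed me in the right direction, here is the answer. Nested parameter names are built by joining each step name with a double underscore, from the outer pipeline down to the parameter itself:
parameters['preprocessor__num__imputer__strategy'] = ['most_frequent', 'mean', 'median']
Thanks, @desernaut!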
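To check the key before running a search, the nested parameter names can be listed with `get_params()`. The following is a minimal sketch rather than the original setup: the toy DataFrame, its column names (`age`, `income`, `city`), and the use of `Ridge` in place of `XGBRegressor` are all assumptions made so the example runs with scikit-learn alone.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: two numerical columns and one categorical column.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "age": rng.normal(40, 10, 60),
    "income": rng.normal(50, 15, 60),
    "city": rng.choice(["a", "b", "c"], 60),
})
X.iloc[::7, 0] = np.nan  # inject missing values into "age"
y = rng.normal(size=60)

numerical_transformer = Pipeline(steps=[
    ("imputer", SimpleImputer()),
    ("scaler", StandardScaler()),
])
categorical_transformer = Pipeline(steps=[
    ("imputer", SimpleImputer(strategy="most_frequent")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])
preprocessor = ColumnTransformer(transformers=[
    ("num", numerical_transformer, ["age", "income"]),
    ("cat", categorical_transformer, ["city"]),
])
# Ridge stands in for XGBRegressor here (assumption, to avoid the xgboost dependency).
my_pipeline = Pipeline(steps=[("preprocessor", preprocessor), ("model", Ridge())])

# List every tunable key that belongs to the numerical imputer.
imputer_keys = sorted(
    key for key in my_pipeline.get_params()
    if key.startswith("preprocessor__num__imputer__")
)
print(imputer_keys)

# Use one of those keys in the grid, exactly as in the answer above.
parameters = {
    "preprocessor__num__imputer__strategy": ["most_frequent", "mean", "median"],
}
search = GridSearchCV(my_pipeline, parameters,
                      scoring="neg_mean_absolute_error", cv=3, n_jobs=1)
search.fit(X, y)
print(search.best_params_)
```

The key reads left to right as `preprocessor` (step in the outer pipeline), `num` (transformer name in the ColumnTransformer), `imputer` (step in the inner pipeline), and `strategy` (the SimpleImputer argument). A misspelled key makes `GridSearchCV.fit` raise an invalid-parameter error, so listing the keys first is a cheap sanity check.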