Passing parameters to a pipeline's fit() in sklearn
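I have an sklearn pipeline that chains PolynomialFeatures() and LinearRegression(). My goal is to fit the data with different values of degree for the polynomial features and measure the score for each. This is the code I am using: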
steps = [('polynomials', preprocessing.PolynomialFeatures()), ('linreg', linear_model.LinearRegression())]
pipeline = pipeline.Pipeline(steps=steps)
scores = dict()
for i in range(2, 6):
    params = {'polynomials__degree': i, 'polynomials__include_bias': False}
    #pipeline.set_params(**params)
    pipeline.fit(X_train, y=yCO_logTrain, **params)
    scores[i] = pipeline.score(X_train, yCO_logTrain)
scores
I get the error: TypeError: fit() got an unexpected keyword argument 'degree'
Why is this error thrown even though the parameters are named in the <estimator_name>__<parameter_name> format?
According to the sklearn.pipeline.Pipeline documentation:
**fit_params : dict of string -> object
Parameters passed to the fit method of each step, where each parameter name is prefixed such that parameter p for step s has key s__p.
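This means that parameters passed this way are forwarded directly to the .fit() method of step s. If you check the PolynomialFeatures documentation, degree is an argument used to construct the PolynomialFeatures object, not an argument of its .fit() method, which is why the call fails with "unexpected keyword argument".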
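For the loop in the question, one way to vary degree is to change the constructor parameter with set_params() before each fit. The sketch below assumes synthetic data in place of X_train and yCO_logTrain, and the variable names pipe and y_train are illustrative. The last line shows what the s__p prefix in fit() is meant for: routing a genuine fit-time argument, here sample_weight for LinearRegression.fit, assuming scikit-learn's default parameter-routing behaviour.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in data (an assumption; substitute X_train / yCO_logTrain).
rng = np.random.default_rng(0)
X_train = rng.uniform(-1, 1, size=(100, 2))
y_train = X_train[:, 0] ** 3 - 2 * X_train[:, 1] + rng.normal(scale=0.1, size=100)

pipe = Pipeline(steps=[
    ('polynomials', PolynomialFeatures()),
    ('linreg', LinearRegression()),
])

scores = {}
for degree in range(2, 6):
    # Constructor parameters are changed with set_params, not through fit().
    pipe.set_params(polynomials__degree=degree, polynomials__include_bias=False)
    pipe.fit(X_train, y_train)
    scores[degree] = pipe.score(X_train, y_train)  # R^2 on the training data

# A real fit-time argument can be routed with the same step__param prefix.
pipe.fit(X_train, y_train, linreg__sample_weight=np.ones(len(y_train)))
```

Note that these scores are computed on the training data, so they can only stay the same or rise as degree grows; a held-out set or cross-validation gives a fairer comparison.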
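If you want to try different hyperparameters for the estimators/transformers in a pipeline, you can use GridSearchCV, as shown in the linked documentation. Here is the example code from that link: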
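from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest
from sklearn.model_selection import GridSearchCV

# calibrated_forest, X and y are assumed to be defined earlier, as in the linked example
pipe = Pipeline([
    ('select', SelectKBest()),
    ('model', calibrated_forest)])
param_grid = {
    'select__k': [1, 2],
    # newer scikit-learn releases name this nested parameter 'estimator' instead of 'base_estimator'
    'model__base_estimator__max_depth': [2, 4, 6, 8]}
search = GridSearchCV(pipe, param_grid, cv=5).fit(X, y)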
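Applied to the pipeline from the question, the same idea might look like the sketch below. It is a minimal illustration, not the asker's actual setup: the data is synthetic, and the names X, y, best_degree and mean_scores are assumptions introduced here. GridSearchCV sets polynomials__degree on a clone of the pipeline for each candidate and scores it with cross-validation.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in data (an assumption; substitute your own X and y).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(120, 2))
y = X[:, 0] ** 3 - 2 * X[:, 1] + rng.normal(scale=0.1, size=120)

pipe = Pipeline([
    ('polynomials', PolynomialFeatures()),
    ('linreg', LinearRegression()),
])

# Grid keys use the same <step_name>__<parameter_name> convention.
param_grid = {
    'polynomials__degree': [2, 3, 4, 5],
    'polynomials__include_bias': [False],
}
search = GridSearchCV(pipe, param_grid, cv=5).fit(X, y)

best_degree = search.best_params_['polynomials__degree']
# Mean cross-validated R^2 for each degree tried.
mean_scores = {
    params['polynomials__degree']: score
    for params, score in zip(search.cv_results_['params'],
                             search.cv_results_['mean_test_score'])
}
```

Here mean_scores plays the role of the scores dict in the question, except that each value is a cross-validated score rather than a training score.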