How to do a grid search through a for loop and a function?

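I have defined a function that takes the name of an ML model as its argument, for example a support vector machine or XGBoost. Since each method needs its best hyperparameter values, I have to use grid search, but I don't know how to embed the grid search in my function.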

Here is my full function:

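from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.svm import SVR
import xgboost as xg


def define_model(model_type):
    if model_type == "linear":
        model = LinearRegression()
    elif model_type == "mlp":
        # (2,) is a one-element tuple: a single hidden layer with 2 units
        model = MLPRegressor(
            hidden_layer_sizes=(2,),
            activation='identity',
            solver='lbfgs',
            max_iter=100000000000)
    elif model_type == 'KN':
        model = KNeighborsRegressor(
            n_neighbors=6, weights='distance',
            algorithm='ball_tree', leaf_size=60, p=2,
            metric='minkowski', metric_params=None,
            n_jobs=None)
    elif model_type == 'Dec_tree':
        model = DecisionTreeRegressor(
            criterion='squared_error', splitter='best', max_depth=33,
            min_samples_split=2, min_samples_leaf=1,
            min_weight_fraction_leaf=0.0, max_features=None,
            random_state=None, max_leaf_nodes=None,
            min_impurity_decrease=0.0, ccp_alpha=0.0)
    elif model_type == 'SV':
        model = SVR(
            kernel='rbf', degree=5, gamma='scale', coef0=0.0, tol=0.001,
            C=10.0, epsilon=0.1, shrinking=True, cache_size=200,
            verbose=False, max_iter=-1)
    elif model_type == 'XGBoost':
        # 'reg:linear' is deprecated in XGBoost; 'reg:squarederror' replaces it
        model = xg.XGBRegressor(objective='reg:squarederror',
                                n_estimators=600, seed=123)
    else:
        # Without this branch, an unknown model_type leaves `model` unassigned
        raise ValueError(f"Unknown model_type: {model_type!r}")

    return model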

Now, if I want to do a grid search for SV or mlp, I would write something like this:

    elif model_type == 'clr_svr':
        svr = SVR()

        parameters_svr = {'kernel': ('linear', 'rbf','poly'), 'C':[1.5, 10],'gamma': [1e-7, 1e-4],'epsilon':[0.1,0.2,0.5,0.3]}

        clr_svr = GridSearchCV(svr, parameters_svr)
        
    return model

But it did not work and raised this error:

UnboundLocalError: local variable 'parameters_svr' referenced before assignment

even though I defined the param list both before and after the function. Can you help me?

Just define the parameter list in a separate cell and refer to it like this:

KNN = KNeighborsRegressor()

param_list = {'n_neighbors': [5, 10, 15, 20, 25, 30, 40, 50], 'weights': ('uniform', 'distance'),
              'algorithm': ('ball_tree', 'kd_tree', 'brute'), 'leaf_size': [30, 35, 40, 45, 50],
              'p': [2, 3, 4, 5], 'n_jobs': [1, 2, 3]}

Then, once you have defined the function for your models, just call the grid search inside it, as shown here:

....
    elif model_type == 'KN':
        model = GridSearchCV(estimator=KNN,  param_grid=param_list)
....
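
To put the pieces together, here is a minimal sketch of how the whole thing might look, including the for loop from the title. It is not the only way to do it: the MODEL_GRIDS dict, the run_all helper, the cv=3 setting, the shortened grids, and the synthetic data from make_regression are all assumptions made for illustration. Keeping each grid next to its estimator at module level means the name already exists when the function body runs, which should avoid the UnboundLocalError.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR

# Hypothetical registry: one (estimator, parameter grid) pair per model type.
# The grids are deliberately small so the example runs quickly.
MODEL_GRIDS = {
    "KN": (
        KNeighborsRegressor(),
        {"n_neighbors": [3, 5, 10], "weights": ["uniform", "distance"]},
    ),
    "SV": (
        SVR(),
        {"kernel": ["linear", "rbf"], "C": [1.5, 10], "epsilon": [0.1, 0.5]},
    ),
}


def define_model(model_type, cv=3):
    """Return an unfitted GridSearchCV wrapping the requested model."""
    if model_type not in MODEL_GRIDS:
        raise ValueError(f"Unknown model_type: {model_type!r}")
    estimator, param_grid = MODEL_GRIDS[model_type]
    # GridSearchCV clones the estimator, so the shared instance is not modified
    return GridSearchCV(estimator=estimator, param_grid=param_grid, cv=cv)


def run_all(X, y):
    """Fit a grid search for every registered model type."""
    results = {}
    for model_type in MODEL_GRIDS:
        search = define_model(model_type)
        search.fit(X, y)
        results[model_type] = (search.best_params_, search.best_score_)
    return results


# Synthetic regression data, only to make the sketch runnable
X, y = make_regression(n_samples=60, n_features=4, noise=0.1, random_state=0)
results = run_all(X, y)
for name, (params, score) in results.items():
    print(name, params, round(score, 3))
```

After fitting, the tuned model is available as search.best_estimator_, so the object returned by define_model can be used for predict just like the plain estimators in the original function.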