Hypopt hyperparameter tuning error: 'sklearn.metrics' has no attribute 'scorer'

Question

I am trying to use hypopt's GridSearch for a multi-class classification task.

param_grid = [{'C': [1, 10, 100],  'penalty' :['l2']}]
gs = GridSearch(model = LogisticRegression(multi_class='multinomial'), param_grid = param_grid)
gs.fit(X_train, y_train, X_val, y_val, scoring='f1_macro')

Without a scoring function it runs as expected. However, when I specify a scoring function, for example 'f1_macro', I get the following error:

   0%|          | 0/3 [00:00<?, ?it/s]/usr/local/lib/python3.6/dist-packages/sklearn/linear_model/_logistic.py:940: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG)
/usr/local/lib/python3.6/dist-packages/hypopt/model_selection.py:174: UserWarning: ERROR in thread<NoDaemonProcess(NoDaemonPoolWorker-59, started)>with exception:
module 'sklearn.metrics' has no attribute 'scorer'
  warnings.warn('ERROR in thread' + pname + "with exception:\n" + str(e))


 33%|███▎      | 1/3 [00:13<00:26, 13.21s/it]/usr/local/lib/python3.6/dist-packages/sklearn/linear_model/_logistic.py:940: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG)
/usr/local/lib/python3.6/dist-packages/hypopt/model_selection.py:174: UserWarning: ERROR in thread<NoDaemonProcess(NoDaemonPoolWorker-60, started)>with exception:
module 'sklearn.metrics' has no attribute 'scorer'
  warnings.warn('ERROR in thread' + pname + "with exception:\n" + str(e))


 67%|██████▋   | 2/3 [00:13<00:09,  9.30s/it]/usr/local/lib/python3.6/dist-packages/sklearn/linear_model/_logistic.py:940: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG)
/usr/local/lib/python3.6/dist-packages/hypopt/model_selection.py:174: UserWarning: ERROR in thread<NoDaemonProcess(NoDaemonPoolWorker-59, started)>with exception:
module 'sklearn.metrics' has no attribute 'scorer'
  warnings.warn('ERROR in thread' + pname + "with exception:\n" + str(e))


100%|██████████| 3/3 [00:19<00:00,  6.59s/it]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-102-2a8cb30a1d8d> in <module>()
      7 # Grid-search all parameter combinations using a validation set.
      8 gs = GridSearch(model = LogisticRegression(multi_class='multinomial'), param_grid = param_grid)
----> 9 gs.fit(X_train, y_train, X_val, y_val, scoring='f1_macro')
     10 

/usr/local/lib/python3.6/dist-packages/hypopt/model_selection.py in fit(self, X_train, y_train, X_val, y_val, scoring, scoring_params, verbose)
    361             else:
    362                 results = [_run_thread_job(job) for job in params]
--> 363             models, scores = list(zip(*results))
    364             self.model = models[np.argmax(scores)]
    365         else:

ValueError: not enough values to unpack (expected 2, got 0)

The error can also be reproduced easily with:

X_train = np.array([[1, 2, 3], [3, 4, 5], [1, 2, 3]])
X_val = X_train
y_train = [1,0,2]
y_val = y_train

I have no idea what is going on.

I am using:

sklearn.__version__
>> 0.22.2.post1
hypopt.__version__
>> 1.0.9

Answer 1

There is a compatibility issue between your hypopt and sklearn versions; the error message is self-explanatory.

I have:

import hypopt
import sklearn
hypopt.__version__, sklearn.__version__
('1.0.9', '0.23.2')

I do get the same error as you. The cause is the following code in the hypopt source (hypopt/model_selection.py):

elif type(scoring) in [metrics.scorer._PredictScorer, metrics.scorer._ProbaScorer] \
            or metrics.scorer._PredictScorer in type(scoring).__bases__ \
            or metrics.scorer._ProbaScorer in type(scoring).__bases__:
            score = scoring(model_clone, job_params["X_val"], job_params["y_val"])
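
This probably also accounts for the final ValueError in the traceback. Each job appears to fail on that attribute lookup, so no (model, score) pairs are collected and the unpacking at line 363 has nothing to unpack. A minimal sketch of that mechanism, using placeholder strings instead of real models (the result lists here are illustrative, not hypopt's actual data):

```python
# Each successful job is assumed to return a (model, score) pair.
ok_results = [("model_a", 0.5), ("model_b", 0.7)]
ok_models, ok_scores = list(zip(*ok_results))
print(ok_models, ok_scores)

# If every job raises, nothing is collected and zip(*[]) yields no tuples.
failed_results = []
try:
    models, scores = list(zip(*failed_results))
except ValueError as exc:
    message = str(exc)
    print(message)
```

The second unpacking raises the same "not enough values to unpack (expected 2, got 0)" message shown in the question, so the ValueError is a symptom of the scorer error rather than a separate problem.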

Change metrics.scorer to metrics._scorer in that code, because that is the module name sklearn 0.23.x uses, and you are good to go.

Proof:

import numpy as np
from sklearn.linear_model import LogisticRegression
from hypopt import GridSearch

X_train = np.array([[1, 2, 3], [3, 4, 5], [1, 2, 3]])
X_val = X_train
y_train = [1,0,2]
y_val = y_train
param_grid = [{'C': [1, 10, 100],  'penalty' :['l2']}]
model = LogisticRegression(multi_class='multinomial')
gs = GridSearch(model = model, param_grid = param_grid, num_threads=1)
gs.fit(X_train, y_train, X_val, y_val, scoring='f1_micro')
100%|██████████| 3/3 [00:00<00:00, 32.56it/s]
LogisticRegression(C=1, class_weight=None, dual=False, fit_intercept=True,
                   intercept_scaling=1, l1_ratio=None, max_iter=100,
                   multi_class='multinomial', n_jobs=None, penalty='l2',
                   random_state=0, solver='lbfgs', tol=0.0001, verbose=0,
                   warm_start=False)
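
If editing the installed hypopt source is not an option, a runtime alias might achieve the same effect. This is only a sketch: it assumes your sklearn version still ships the private module sklearn.metrics._scorer with the class names hypopt 1.0.9 looks up, and private names can change between releases.

```python
import sklearn.metrics as metrics

# Assumption: the scorer classes live in the private module
# sklearn.metrics._scorer, as the fix above relies on.
if not hasattr(metrics, "scorer"):
    from sklearn.metrics import _scorer
    metrics.scorer = _scorer  # expose it under the old name hypopt looks up

# hypopt's check references these two class names; report any that this
# sklearn version does not provide, in which case the alias is not enough.
missing = [name for name in ("_PredictScorer", "_ProbaScorer")
           if not hasattr(metrics.scorer, name)]
print("scorer classes missing from this sklearn:", missing)
```

Run this before calling gs.fit in the same process, with num_threads=1 as in the proof above. Whether worker processes see the alias depends on how they are started, so treat the multi-process case as untested.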
