Error when running any BayesSearchCV function for a RandomForest classifier

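I am trying to use an RF classifier, but every time I run the BayesSearchCV function it returns an error. Below are my specific example and a second example that you can run to reproduce the problem. I suspect this may be caused by the train_test_split function, but I am not entirely sure how to triage it. Please let me know if anything in my code is obviously wrong...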

I am currently using the latest versions of sklearn, skopt, numpy, etc.

import numpy as np
import pandas as pd
from sklearn import preprocessing
from matplotlib import pyplot as plt
import xgboost as xgb
import sklearn
from sklearn.linear_model import LogisticRegression
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score 
from sklearn.metrics import roc_auc_score
from skopt import BayesSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
opt = BayesSearchCV(
    RandomForestClassifier(random_state=42),
    {
        'n_estimators': (5,5000),
        'max_features': ['auto','sqrt'],
        'max_depth': (2,90),
        'min_samples_split': (2,10),
        'min_samples_leaf': (1,7),
        'bootstrap': [True, False]  # booleans; the strings "True"/"False" are both truthy
    },
    n_iter=32,
    cv=3,
    scoring='roc_auc'
)
opt.fit(full_train, full_y_train)

print("val. score: %s" % opt.best_score_)
print("test score: %s" % opt.score(X_test_red, y_test))

The error:

/Users/user/opt/anaconda3/lib/python3.8/site-packages/sklearn/utils/deprecation.py:67: FutureWarning: Class MaskedArray is deprecated; MaskedArray is deprecated in version 0.23 and will be removed in version 0.25. Use numpy.ma.MaskedArray instead.
  warnings.warn(msg, category=FutureWarning)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-20-8b1596e90c35> in <module>
----> 1 opt.fit(full_train, full_y_train)
      2 
      3 print("val. score: %s" % opt.best_score_)
      4 print("test score: %s" % opt.score(X_test_red, y_test))

~/opt/anaconda3/lib/python3.8/site-packages/skopt/searchcv.py in fit(self, X, y, groups, callback)

~/opt/anaconda3/lib/python3.8/site-packages/skopt/searchcv.py in _step(self, X, y, search_space, optimizer, groups, n_points)

~/opt/anaconda3/lib/python3.8/site-packages/skopt/searchcv.py in _fit(self, X, y, groups, parameter_iterable)

~/opt/anaconda3/lib/python3.8/site-packages/sklearn/utils/deprecation.py in wrapped(*args, **kwargs)
     66         def wrapped(*args, **kwargs):
     67             warnings.warn(msg, category=FutureWarning)
---> 68             return init(*args, **kwargs)
     69         cls.__init__ = wrapped
     70 

TypeError: object.__init__() takes exactly one argument (the instance to initialize)

An example you can use to reproduce it:

from skopt import BayesSearchCV
from sklearn.datasets import load_digits
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

X, y = load_digits(n_class=10, return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.75, test_size=.25, random_state=0)

# log-uniform: understand as search over p = exp(x) by varying x
opt = BayesSearchCV(
    SVC(),
    {
        'C': (1e-6, 1e+6, 'log-uniform'),
        'gamma': (1e-6, 1e+1, 'log-uniform'),
        'degree': (1, 8),  # integer valued parameter
        'kernel': ['linear', 'poly', 'rbf'],  # categorical parameter
    },
    n_iter=32,
    cv=3
)

opt.fit(X_train, y_train)

print("val. score: %s" % opt.best_score_)
print("test score: %s" % opt.score(X_test, y_test))

This gives the same error on my machine as the first example.

It turns out that with sklearn 0.23.0 this can currently only be resolved with a workaround:

from numpy.ma import MaskedArray
import sklearn.utils.fixes

sklearn.utils.fixes.MaskedArray = MaskedArray

import skopt

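Apply this patch before importing skopt, then run your code from there. In my case, I could not install an older version of scikit-learn with conda, so I am out of luck until one of the packages is updated.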
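For anyone wondering why the patch helps: the traceback shows the deprecation wrapper calling `init(*args, **kwargs)`, and the message says that `init` is `object.__init__`. Below is a minimal sketch of that failure mode in plain Python, with no sklearn or numpy involved. `deprecated_class`, `NewOnly` and `DeprecatedAlias` are hypothetical names made up for illustration, and it assumes the deprecated class takes its arguments in `__new__` and defines no `__init__` of its own, which is what the error message suggests.

```python
import warnings


def deprecated_class(cls):
    """Hypothetical stand-in for a class-deprecation decorator (not sklearn's code)."""
    init = cls.__init__  # resolves to object.__init__ when only __new__ is defined

    def wrapped(*args, **kwargs):
        warnings.warn("class is deprecated", category=FutureWarning)
        return init(*args, **kwargs)  # forwards the constructor args to object.__init__

    cls.__init__ = wrapped
    return cls


class NewOnly:
    """Toy class that receives its constructor arguments in __new__ only."""

    def __new__(cls, data):
        obj = super().__new__(cls)
        obj.data = data
        return obj


@deprecated_class
class DeprecatedAlias(NewOnly):
    pass


def try_construct(cls):
    """Return the TypeError raised by cls([1, 2, 3]), or None on success."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", FutureWarning)
        try:
            cls([1, 2, 3])
        except TypeError as exc:
            return exc
    return None


print(try_construct(NewOnly))          # None: the undecorated class is fine
print(try_construct(DeprecatedAlias))  # TypeError raised by object.__init__
```

If that is indeed what happens inside sklearn 0.23, then pointing `sklearn.utils.fixes.MaskedArray` at the undecorated `numpy.ma.MaskedArray` simply takes the wrapper out of the picture, which would explain why the assignment has to happen before `import skopt`.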

The issue with sklearn >= 0.23.0 was fixed in skopt version 0.8.1: https://pypi.org/project/scikit-optimize/0.8.1/
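If you are not sure whether your environment is affected, a quick version check along these lines may help. This is only a sketch based on the version numbers mentioned in this thread (sklearn 0.23.0 or newer combined with skopt older than 0.8.1). `parse_version`, `needs_workaround` and `installed_needs_workaround` are hypothetical helper names, and the parsing is deliberately naive, so pre-release tags such as `rc1` are not handled carefully.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn a string such as '0.23.0' into a tuple of its leading integers."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'dev1'
        parts.append(int(match.group()))
    return tuple(parts)


def needs_workaround(sklearn_version, skopt_version):
    """True for the combination reported as broken in this thread."""
    return (parse_version(sklearn_version) >= (0, 23)
            and parse_version(skopt_version) < (0, 8, 1))


def installed_needs_workaround():
    """Check the installed packages; returns None if either one is missing."""
    try:
        return needs_workaround(version("scikit-learn"),
                                version("scikit-optimize"))
    except PackageNotFoundError:
        return None


print(needs_workaround("0.23.0", "0.7.4"))  # the combination from the question
print(needs_workaround("0.23.0", "0.8.1"))  # after upgrading skopt
```

When `installed_needs_workaround()` returns True, upgrading scikit-optimize to 0.8.1 or later is probably the cleaner route than keeping the monkey patch around.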