XGBoost GPU fails to run with scikit-learn RandomizedSearchCV
XGBoost works fine on both CPU and GPU, but as soon as I add scikit-learn's RandomizedSearchCV for hyperparameter tuning, it fails.
System: Ubuntu 20
Environment: conda virtual environment, Python 3.7
XGBoost installation: conda install -c anaconda py-xgboost-gpu
Code:
from sklearn.model_selection import cross_val_score, RandomizedSearchCV, train_test_split
import xgboost as xgb
from scipy.stats import uniform, randint
xgb_model = xgb.XGBRegressor(objective="reg:squarederror")
params = {}
params['eval_metric'] = 'rmse'
params['tree_method'] = 'gpu_hist'
params['colsample_bytree'] = uniform(0.7, 0.3)
params['gamma'] = uniform(0, 0.5)
params['learning_rate'] = uniform(0.03, 0.3)
params['max_depth'] = randint(2,6)
params['n_estimators'] = randint(100, 150)
params['subsample'] = uniform(0.6, 0.4)
search = RandomizedSearchCV(xgb_model, param_distributions=params, random_state=42, n_iter=200, cv=3, verbose=1, return_train_score=True) #n_jobs=8,
search.fit(X_train, y_train)
print(search)
Error:
Fitting 3 folds for each of 200 candidates, totalling 600 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
/home/polabs1/anaconda3/envs/PoEnv_XGB_gpu/lib/python3.7/site-packages/sklearn/model_selection/_validation.py:552: FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
File "/home/polabs1/anaconda3/envs/PoEnv_XGB_gpu/lib/python3.7/site-packages/sklearn/model_selection/_validation.py", line 531, in _fit_and_score
estimator.fit(X_train, y_train, **fit_params)
File "/home/polabs1/anaconda3/envs/PoEnv_XGB_gpu/lib/python3.7/site-packages/xgboost/sklearn.py", line 396, in fit
callbacks=callbacks)
File "/home/polabs1/anaconda3/envs/PoEnv_XGB_gpu/lib/python3.7/site-packages/xgboost/training.py", line 216, in train
xgb_model=xgb_model, callbacks=callbacks)
File "/home/polabs1/anaconda3/envs/PoEnv_XGB_gpu/lib/python3.7/site-packages/xgboost/training.py", line 74, in _train_internal
bst.update(dtrain, i, obj)
File "/home/polabs1/anaconda3/envs/PoEnv_XGB_gpu/lib/python3.7/site-packages/xgboost/core.py", line 1109, in update
dtrain.handle))
File "/home/polabs1/anaconda3/envs/PoEnv_XGB_gpu/lib/python3.7/site-packages/xgboost/core.py", line 176, in _check_call
raise XGBoostError(py_str(_LIB.XGBGetLastError()))
xgboost.core.XGBoostError: Invalid Input: 's', valid values are: {'approx', 'auto', 'exact', 'gpu_exact', 'gpu_hist', 'hist'}
Thanks, everyone.
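The param_distributions argument needs to be a dictionary whose values are lists/arrays or distributions. A bare string is treated as a sequence of characters, so the search samples one character at a time, which is why the error complains about the invalid value 's'. As written, the code causes the eval_metric and tree_method entries to be interpreted as: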
params['eval_metric'] = ['r', 'm', 's', 'e']
params['tree_method'] = ['g', 'p', 'u', '_', 'h', 'i', 's', 't']
To fix it, replace the relevant lines with:
params['eval_metric'] = ['rmse']
params['tree_method'] = ['gpu_hist']
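To see the difference without a GPU or a training run, the sampling step can be reproduced on its own. The sketch below uses scikit-learn's ParameterSampler, which is what RandomizedSearchCV draws candidates from; the variable names (broken, fixed, broken_samples, fixed_samples) are illustrative and not from the original post, and the exact characters drawn may differ between versions.

```python
from scipy.stats import uniform
from sklearn.model_selection import ParameterSampler

# Bare strings: each draw picks a single character from the string.
broken = {
    "tree_method": "gpu_hist",
    "eval_metric": "rmse",
    "gamma": uniform(0, 0.5),  # a real distribution, as in the question
}

# One-element lists: every draw returns the full string.
fixed = {
    "tree_method": ["gpu_hist"],
    "eval_metric": ["rmse"],
    "gamma": uniform(0, 0.5),
}

broken_samples = list(ParameterSampler(broken, n_iter=5, random_state=42))
fixed_samples = list(ParameterSampler(fixed, n_iter=5, random_state=42))

for sample in broken_samples:
    # tree_method and eval_metric come out as single characters here
    print("broken:", sample["tree_method"], sample["eval_metric"])

for sample in fixed_samples:
    # tree_method and eval_metric are the intended full strings here
    print("fixed: ", sample["tree_method"], sample["eval_metric"])
```

Since tree_method is a fixed setting rather than something to search over, another option is to pass it directly to the estimator, e.g. xgb.XGBRegressor(objective="reg:squarederror", tree_method="gpu_hist"), and leave it out of params entirely.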