Multiclass classification in xgboost (Python)
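I can't figure out how to pass the number of classes or the eval metric to xgb.XGBClassifier when using the objective function 'multi:softmax'.
I have looked through a lot of documentation, but the only thing it talks about is the sklearn wrapper accepting n_class/num_class.
My current setup looks like this: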
import xgboost as xgb
from sklearn import cross_validation, metrics  # legacy scikit-learn (< 0.20) module layout

kf = cross_validation.KFold(y_data.shape[0],
                            n_folds=10, shuffle=True, random_state=30)
err = []  # to hold cross val errors
# xgb instance
xgb_model = xgb.XGBClassifier(n_estimators=_params['n_estimators'],
                              max_depth=_params['max_depth'],
                              learning_rate=_params['learning_rate'],
                              min_child_weight=_params['min_child_weight'],
                              subsample=_params['subsample'],
                              colsample_bytree=_params['colsample_bytree'],
                              objective='multi:softmax', nthread=4)
# cv
for train_index, test_index in kf:
    xgb_model.fit(x_data[train_index], y_data[train_index], eval_metric='mlogloss')
    predictions = xgb_model.predict(x_data[test_index])
    actuals = y_data[test_index]
    err.append(metrics.accuracy_score(actuals, predictions))
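You do not need to set num_class in the scikit-learn API for XGBoost classification. It is done automatically when fit is called. Look at the beginning of the fit method of XGBClassifier in xgboost/sklearn.py: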
evals_result = {}
self.classes_ = np.unique(y)
self.n_classes_ = len(self.classes_)
xgb_options = self.get_xgb_params()
if callable(self.objective):
    obj = _objective_decorator(self.objective)
    # Use default value. Is it really not used ?
    xgb_options["objective"] = "binary:logistic"
else:
    obj = None
if self.n_classes_ > 2:
    # Switch to using a multiclass objective in the underlying XGB instance
    xgb_options["objective"] = "multi:softprob"
    xgb_options['num_class'] = self.n_classes_
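To make the effect of that excerpt concrete, here is a minimal pure-Python sketch of the same decision logic. The helper name resolve_options is hypothetical and is not part of xgboost. It uses sorted(set(y)) in place of np.unique(y), and it mirrors only the lines quoted above, so other xgboost versions may behave differently.

```python
def resolve_options(y, objective="binary:logistic"):
    """Hypothetical stand-in for the quoted part of XGBClassifier.fit."""
    classes = sorted(set(y))  # plays the role of np.unique(y)
    n_classes = len(classes)
    options = {"objective": objective}
    if callable(objective):
        # A custom objective is handled separately; the option falls back to a default.
        options["objective"] = "binary:logistic"
    if n_classes > 2:
        # More than two distinct labels: the multiclass objective and
        # num_class are filled in from the labels themselves.
        options["objective"] = "multi:softprob"
        options["num_class"] = n_classes
    return options


three_class = resolve_options([0, 1, 2, 1, 0], objective="multi:softmax")
two_class = resolve_options([0, 1, 1, 0], objective="binary:logistic")
print(three_class)
print(two_class)
```

Under the quoted logic, the 'multi:softmax' passed by the caller is replaced by 'multi:softprob' once more than two classes are seen, and num_class is derived from the labels given to fit, so it never has to be supplied by hand.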