Returning standard deviation with BaggingRegressor

Question

Is there a way to return the standard deviation using sklearn.ensemble.BaggingRegressor?

In the several examples I looked through, all of them only give the mean prediction.

Answer 1

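No, there is no built-in way to do this.

The estimators_ attribute holds the fitted estimators, and estimators_features_ records the feature subset each one was trained on (relevant if you have set max_features < 1.0), so you can reproduce the individual predictions manually.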
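As a minimal sketch of that manual reproduction, assuming the default base estimator and an illustrative max_features=0.5 (neither choice comes from the answer itself), each member can be fed only the columns listed in its entry of estimators_features_:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor

X, y = make_regression(n_samples=100, n_features=4,
                       n_informative=2, random_state=0)

# Illustrative settings (assumptions): default base estimator, and
# max_features=0.5 so that each member is trained on a subset of the columns.
regr = BaggingRegressor(n_estimators=10, max_features=0.5,
                        random_state=0).fit(X, y)

# One row of predictions per ensemble member; each member only sees
# the columns it was trained on.
per_member = np.array([
    est.predict(X[:, feats])
    for est, feats in zip(regr.estimators_, regr.estimators_features_)
])

mean_pred = per_member.mean(axis=0)  # should match regr.predict(X)
std_pred = per_member.std(axis=0)    # spread across members, per sample
```

With the default max_features=1.0 every member sees all columns, so the column selection is harmless but not needed.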

Answer 2

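You can always get the underlying predictions from each estimator of the ensemble (the estimators are accessible through the ensemble's estimators_ attribute) and process those predictions as needed (compute the mean, standard deviation, etc.).

Adapting the example from the documentation, with an ensemble of 10 SVR base estimators: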

import numpy as np
from sklearn.svm import SVR
from sklearn.ensemble import BaggingRegressor
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=100, n_features=4,
                       n_informative=2, n_targets=1,
                       random_state=0, shuffle=False)
# note: the `estimator` argument was named `base_estimator` before scikit-learn 1.2
regr = BaggingRegressor(estimator=SVR(),
                        n_estimators=10, random_state=0).fit(X, y)


regr.predict([[0, 0, 0, 0]]) # get (mean) prediction for a single sample, [0, 0, 0, 0]
# array([-2.87202411])

# get the predictions from each individual member of the ensemble using a list comprehension:

raw_pred = [x.predict([[0, 0, 0, 0]]) for x in regr.estimators_]
raw_pred
# result:
[array([-2.13003431]),
 array([-1.96224516]),
 array([-1.90429596]),
 array([-6.90647796]),
 array([-6.21360547]),
 array([-1.84318744]),
 array([1.82285686]),
 array([4.62508622]),
 array([-5.60320499]),
 array([-8.60513286])]

# get the mean, and check that it matches the value returned above by the .predict method of the ensemble:

np.mean(raw_pred)
# -2.8720241079257436
np.isclose(np.mean(raw_pred), regr.predict([[0, 0, 0, 0]])) # sanity check; isclose avoids a brittle exact float comparison
# array([ True])

# get the standard deviation:
np.std(raw_pred)
# 3.865135037828279
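For more than one sample at a time, np.std(raw_pred) without an axis argument would pool all values into a single number. A possible way around that is to wrap the steps above in a small helper. The name predict_with_std and its ddof parameter are hypothetical and not part of scikit-learn; this is only a sketch:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
from sklearn.svm import SVR


def predict_with_std(ensemble, X, ddof=0):
    """Hypothetical helper: per-sample mean and std over ensemble members."""
    X = np.asarray(X)
    # Shape (n_estimators, n_samples): one row per fitted member.
    preds = np.stack([
        est.predict(X[:, feats])
        for est, feats in zip(ensemble.estimators_,
                              ensemble.estimators_features_)
    ])
    # axis=0 reduces over members, leaving one value per sample.
    return preds.mean(axis=0), preds.std(axis=0, ddof=ddof)


X, y = make_regression(n_samples=100, n_features=4,
                       n_informative=2, n_targets=1,
                       random_state=0, shuffle=False)
# The base estimator is passed positionally, so its keyword name does not matter.
regr = BaggingRegressor(SVR(), n_estimators=10, random_state=0).fit(X, y)

mean, std = predict_with_std(regr, X[:5])               # population std (ddof=0)
_, std_sample = predict_with_std(regr, X[:5], ddof=1)   # sample std
```

Passing ddof=1 gives the sample standard deviation, which may be the better choice when the ensemble is small.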

scikit-learn

ensemble-learning