LDA as base learner for AdaBoost in Python
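I am using AdaBoost for multi-class classification, with a discriminant (linear or quadratic) as the base learner. I could not find any functionality for this in scikit-learn or any other library. How can I do it?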
Although scikit-learn's AdaBoostClassifier lets you choose a base estimator (see the documentation), it requires that estimator to support sample_weight. Take a look at the source:
if not has_fit_parameter(self.base_estimator_, "sample_weight"):
    raise ValueError("%s doesn't support sample_weight."
                     % self.base_estimator_.__class__.__name__)
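The same check can be run by hand before handing an estimator to AdaBoostClassifier. The following is a minimal sketch, assuming a scikit-learn install in which has_fit_parameter is importable from sklearn.utils.validation; the list of candidates is only illustrative and is not part of the original answer.

```python
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils.validation import has_fit_parameter

# Illustrative candidates for the base learner (an assumption for this sketch).
candidates = [
    DecisionTreeClassifier(max_depth=1),
    LinearDiscriminantAnalysis(),
    QuadraticDiscriminantAnalysis(),
]

# Map each estimator's class name to whether its fit() accepts sample_weight.
support = {
    type(est).__name__: has_fit_parameter(est, "sample_weight")
    for est in candidates
}

for name, ok in support.items():
    print(f"{name}: sample_weight supported = {ok}")
```

The decision stump is included only as a point of comparison, since it is the default base learner of AdaBoostClassifier.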
Unfortunately, neither LinearDiscriminantAnalysis nor QuadraticDiscriminantAnalysis falls into this category. Here is a toy example:
from sklearn.ensemble import AdaBoostClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis as QDA
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target)
clf = AdaBoostClassifier(base_estimator=LDA())
clf.fit(X_train, y_train)
You will see a traceback like the following:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "//anaconda/lib/python2.7/site-packages/sklearn/ensemble/weight_boosting.py", line 411, in fit
    return super(AdaBoostClassifier, self).fit(X, y, sample_weight)
  File "//anaconda/lib/python2.7/site-packages/sklearn/ensemble/weight_boosting.py", line 128, in fit
    self._validate_estimator()
  File "//anaconda/lib/python2.7/site-packages/sklearn/ensemble/weight_boosting.py", line 429, in _validate_estimator
    % self.base_estimator_.__class__.__name__)
ValueError: LinearDiscriminantAnalysis doesn't support sample_weight.
This is a requirement you will not get around in scikit-learn. The documentation states plainly that it is a hard requirement:
"...Support for sample weighting is required, as well as proper classes_ and n_classes_ attributes."
However, if you simply want to use an ensemble, you can always use bagging instead of boosting:
from sklearn.ensemble import BaggingClassifier
clf = BaggingClassifier(base_estimator=LDA())
clf.fit(X_train, y_train)
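To see how the bagged LDA ensemble behaves end to end, here is a self-contained sketch that also scores it on a held-out split. The fixed random_state, the stratified split, and n_estimators=10 are assumptions made for reproducibility, not choices from the original answer. The estimator is passed positionally because newer scikit-learn releases name this parameter estimator rather than base_estimator, while it remains the first argument in both.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Stratified split with a fixed seed so repeated runs give the same partition.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Pass the base learner positionally: the keyword is base_estimator in older
# releases and estimator in newer ones, but it is the first argument in both.
clf = BaggingClassifier(
    LinearDiscriminantAnalysis(), n_estimators=10, random_state=0
)
clf.fit(X_train, y_train)

# Mean accuracy of the bagged ensemble on the held-out samples.
accuracy = clf.score(X_test, y_test)
print(f"Held-out accuracy: {accuracy:.3f}")
```

Bagging needs no sample_weight support because each LDA is fit on a bootstrap resample of the training set rather than on reweighted samples.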