How does sklearn's MLP predict_proba function work internally?

I would like to understand how sklearn's MLPClassifier produces the results of its predict_proba function.

The documentation only says:

Probability estimates

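Many other estimators, such as logistic regression, have more detailed descriptions:

Probability estimates.

The returned estimates for all classes are ordered by the label of classes.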

For a multi_class problem, if multi_class is set to be “multinomial” the softmax function is used to find the predicted probability of each class. Else use a one-vs-rest approach, i.e calculate the probability of each class assuming it to be positive using the logistic function. and normalize these values across all the classes.

Other model types also have more detail. Take the support vector machine classifier, for example (there is also this very nice Stack Overflow post that explains it in depth):

Compute probabilities of possible outcomes for samples in X.

The model need to have probability information computed at training time: fit with attribute probability set to True.

Other examples:

Random Forest:

Predict class probabilities for X.

The predicted class probabilities of an input sample are computed as the mean predicted class probabilities of the trees in the forest. The class probability of a single tree is the fraction of samples of the same class in a leaf.

Gaussian Process Classifier:

I would like to understand the same thing as in the post above, but for MLPClassifier. How does MLPClassifier work internally?

Looking in the source code, I found:

def _initialize(self, y, layer_units):

    # set all attributes, allocate weights etc for first call
    # Initialize parameters
    self.n_iter_ = 0
    self.t_ = 0
    self.n_outputs_ = y.shape[1]

    # Compute the number of layers
    self.n_layers_ = len(layer_units)

    # Output for regression
    if not is_classifier(self):
        self.out_activation_ = 'identity'
    # Output for multi class
    elif self._label_binarizer.y_type_ == 'multiclass':
        self.out_activation_ = 'softmax'
    # Output for binary class and multi-label
    else:
        self.out_activation_ = 'logistic'

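It seems the MLP classifier builds its output layer with a logistic function for binary and multi-label classification, and a softmax function for multiclass classification. This suggests that the output of the network is a probability vector, from which the network derives its predictions.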
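The choice of output activation can be checked on a fitted model through its `out_activation_` attribute. The snippet below is a minimal sketch on made-up random data; the variable names and the tiny network size are assumptions chosen only for illustration.

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier

# The toy models are trained for only a few iterations, so silence the
# expected convergence warnings.
warnings.simplefilter("ignore", ConvergenceWarning)

rng = np.random.RandomState(0)
X = rng.rand(60, 4)  # illustrative random features

targets = {
    "binary": np.arange(60) % 2,                      # labels 0/1
    "multiclass": np.arange(60) % 3,                  # labels 0/1/2
    "multilabel": rng.randint(0, 2, size=(60, 3)),    # 0/1 indicator matrix
}

activations = {}
for name, y in targets.items():
    clf = MLPClassifier(hidden_layer_sizes=(5,), max_iter=50, random_state=0)
    clf.fit(X, y)
    activations[name] = clf.out_activation_
    print(name, "->", clf.out_activation_)
```

Under these assumptions only the multiclass target should report `softmax`; the binary and multi-label targets should both report `logistic`.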

If I look at the predict_proba method:

def predict_proba(self, X):
    """Probability estimates.
    Parameters
    ----------
    X : {array-like, sparse matrix} of shape (n_samples, n_features)
        The input data.
    Returns
    -------
    y_prob : ndarray of shape (n_samples, n_classes)
        The predicted probability of the sample for each class in the
        model, where classes are ordered as they are in `self.classes_`.
    """
    check_is_fitted(self)
    y_pred = self._predict(X)

    if self.n_outputs_ == 1:
        y_pred = y_pred.ravel()

    if y_pred.ndim == 1:
        return np.vstack([1 - y_pred, y_pred]).T
    else:
        return y_pred

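This confirms that softmax or logistic, used as the activation function of the output layer, is what produces the probability vector. In the binary case the network has a single logistic output p, which predict_proba expands into the two columns [1 - p, p].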
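One way to convince yourself of this is to redo the forward pass by hand from the fitted weights and compare it with predict_proba. The following is a rough sketch, not sklearn's own implementation: `manual_predict_proba` is a hypothetical helper, it assumes the default `relu` hidden activation, and the datasets are synthetic.

```python
import warnings

import numpy as np
from scipy.special import expit, softmax
from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier

warnings.simplefilter("ignore", ConvergenceWarning)


def manual_predict_proba(clf, X):
    """Hypothetical helper: forward pass with NumPy, assuming 'relu' hidden layers."""
    a = X
    # Hidden layers: affine transform followed by ReLU.
    for W, b in zip(clf.coefs_[:-1], clf.intercepts_[:-1]):
        a = np.maximum(a @ W + b, 0.0)
    # Output layer: affine transform, then the activation chosen at fit time.
    z = a @ clf.coefs_[-1] + clf.intercepts_[-1]
    if clf.out_activation_ == "softmax":
        return softmax(z, axis=1)
    p = expit(z)  # logistic output
    if p.shape[1] == 1:
        # Binary case: a single output p becomes the columns [1 - p, p].
        p = p.ravel()
        return np.vstack([1 - p, p]).T
    return p  # multi-label case: one independent probability per label


# Multiclass problem (softmax output).
X_multi, y_multi = make_classification(
    n_samples=200, n_features=8, n_informative=4, n_classes=3, random_state=0
)
clf_multi = MLPClassifier(hidden_layer_sizes=(16,), max_iter=300, random_state=0)
clf_multi.fit(X_multi, y_multi)

# Binary problem (logistic output).
X_bin, y_bin = make_classification(
    n_samples=200, n_features=8, n_informative=4, n_classes=2, random_state=0
)
clf_bin = MLPClassifier(hidden_layer_sizes=(16,), max_iter=300, random_state=0)
clf_bin.fit(X_bin, y_bin)

print(np.allclose(manual_predict_proba(clf_multi, X_multi), clf_multi.predict_proba(X_multi)))
print(np.allclose(manual_predict_proba(clf_bin, X_bin), clf_bin.predict_proba(X_bin)))
```

If both comparisons agree, the probabilities returned by predict_proba are simply the activations of the output layer, with no extra calibration step on top.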

Hope this helps.