Ordinal logistic regression: Intercept_ returns [1] instead of [n]

Question

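I am running an ordinal (i.e. multinomial) ridge regression using the mord (scikit-learn) library.

y is a single column containing integer values from 1 to 19.

X consists of 7 numeric variables, each binned into 4 buckets and then dummy-coded, giving 28 binary variables in total.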
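For readers who want to reproduce this setup, the feature construction described above might look roughly like the sketch below. The question does not say how the buckets were chosen, so the quantile bins, the random data, and the column names v1..v7 are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Hypothetical raw data: 7 numeric columns named v1..v7.
raw = pd.DataFrame(rng.normal(size=(200, 7)),
                   columns=[f"v{i}" for i in range(1, 8)])

# Bin each column into 4 quantile buckets (integer codes 0..3).
binned = raw.apply(lambda col: pd.qcut(col, q=4, labels=False))

# Dummy-code every binned column: 7 variables x 4 buckets = 28 binary columns.
X_demo = pd.get_dummies(binned.astype("category"))
print(X_demo.shape)  # (200, 28)
```

Each row of X_demo then has exactly one active indicator per original variable.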

import pandas as pd
import numpy as np
from sklearn import metrics
from sklearn.model_selection import train_test_split
import mord

in_X, out_X, in_y, out_y = train_test_split(X, y,
                                            stratify=y,
                                            test_size=0.3,
                                            random_state=42)

mul_lr = mord.OrdinalRidge(alpha=1.0,
                           fit_intercept=True,
                           normalize=False,
                           copy_X=True,
                           max_iter=None,
                           tol=0.001,
                           solver='auto').fit(in_X, in_y)

mul_lr.coef_ returns a [28 x 1] array, but mul_lr.intercept_ returns a single value (instead of 19).

Any idea what I am missing?

Answer 1

If you want the model to predict over all 19 classes, you need to convert the label y to a one-hot encoding before training the model.

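import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Shift labels from 1..19 to 0..18 and reshape to a 2-D column, as OneHotEncoder expects.
y = np.asarray(y).reshape(-1, 1) - 1
# n_values was removed in scikit-learn 0.22; list the 19 categories explicitly instead.
enc = OneHotEncoder(categories=[np.arange(19)])
y = enc.fit_transform(y).toarray()  # shape (n_samples, 19)
# ... train the model on the one-hot encoded y here ...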

Now mul_lr.intercept_.shape should be (19,).
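The effect of the target's shape on the intercept can be checked with a small sketch. It uses plain sklearn.linear_model.Ridge as a stand-in for mord.OrdinalRidge (an assumption, since mord is not imported here) and synthetic data in place of the asker's X and y.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(0)
# Synthetic stand-in data: 28 binary features, labels 1..19 (every label present).
X_demo = rng.integers(0, 2, size=(190, 28)).astype(float)
y_demo = np.tile(np.arange(1, 20), 10)

# 1-D target: a single regression, hence a single intercept.
single = Ridge(alpha=1.0).fit(X_demo, y_demo)

# One-hot target: 19 output columns, hence 19 intercepts.
enc = OneHotEncoder(categories=[np.arange(1, 20)])
y_onehot = enc.fit_transform(y_demo.reshape(-1, 1)).toarray()
multi = Ridge(alpha=1.0).fit(X_demo, y_onehot)

print(np.size(single.intercept_))  # 1
print(multi.intercept_.shape)      # (19,)
print(multi.coef_.shape)           # (19, 28)
```

With a one-hot target, the model fits one ridge regression per class indicator, so it no longer uses the ordering of the labels.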

python

python-3.x

scikit-learn

logistic-regression