scikit-learn neural net beginner - results not what I expect

I have a simple example that I'm trying to classify with MLPClassifier.

from sklearn.neural_network import MLPClassifier

# What are the features in our X data?
#  0. do .X files exist?
#  1. do .Y files exist?
#  2. does a Z.Z file exist?
# values are 0 for false and 1 for true

training_x = (
    [0,1,0],  # pure .Y files, no Z.Z
    [1,0,1],  # .X files and Z.Z
    [1,0,0],  # .X files, no Z.Z
)
training_y = ('.Y, no .X, no Z.Z', '.X + Z.Z', '.X w/o Z.Z')
clf = MLPClassifier(solver='lbfgs', alpha=1e-5,
                    hidden_layer_sizes=(len(training_x)+1, len(training_x)+1), random_state=1)
# training
clf.fit(training_x, training_y)
# predictions
for i in (0,1):
    for j in (0,1):
        for k in (0,1):
            results = list(clf.predict_proba([[i, j, k]])[0])
            # seems they are reversed:
            results.reverse()
            discrete_results = None
            for index in range(len(training_x)):
                if results[index] > 0.999:
                    if discrete_results is not None:
                        print('hold on a minute')
                    discrete_results = training_y[index]
            print(f'{i},{j},{k} ==> {results}, discrete={discrete_results}')

When I test it with all possible (discrete) inputs, I expect that the predictions for the input cases [0,1,0], [1,0,1], and [1,0,0] will closely match my three training_y cases, while the results for the other input cases will be ill-defined and uninteresting. However, those three input cases don't match at all, unless I reverse the proba results, in which case [0,1,0] does match and the other two are swapped. Here is the output, with the reversal included:

0,0,0 ==> [1.1527971240749179e-19, 0.0029561479916546647, 0.9970438520083453], discrete=None
0,0,1 ==> [0.9999549772644907, 3.686866933257315e-08, 4.498586684013346e-05], discrete=.Y, no .X, no Z.Z
0,1,0 ==> [0.9999549772644907, 3.686866933257315e-08, 4.498586684013346e-05], discrete=.Y, no .X, no Z.Z
0,1,1 ==> [0.9999549772644907, 3.686866933257315e-08, 4.498586684013346e-05], discrete=.Y, no .X, no Z.Z
1,0,0 ==> [4.971668615064256e-68, 0.9999999980156198, 1.9843802638506693e-09], discrete=.X + Z.Z
1,0,1 ==> [1.3622448606166547e-05, 3.911037287197552e-05, 0.9999472671785217], discrete=.X w/o Z.Z
1,1,0 ==> [3.09415772026147e-33, 0.934313523906787, 0.06568647609321301], discrete=None
1,1,1 ==> [0.9999549772644907, 3.686866933257315e-08, 4.498586684013346e-05], discrete=.Y, no .X, no Z.Z

No doubt I've made some silly beginner mistake! Help finding it would be much appreciated.

The order of the probabilities in predict_proba is not "reversed"; the columns are stored in (presumably) alphabetical order of the class labels. You can check the order via the classes_ attribute. Instead of discretizing yourself at a 0.999 threshold, consider calling predict, which takes the class with the largest probability, and, more importantly, internally converts back to the class's text label.
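A minimal sketch of what this looks like with your data (same model settings as in the question); classes_ shows the column order of predict_proba, and predict returns the text label directly:

```python
from sklearn.neural_network import MLPClassifier

# Same toy data as in the question
training_x = [
    [0, 1, 0],  # pure .Y files, no Z.Z
    [1, 0, 1],  # .X files and Z.Z
    [1, 0, 0],  # .X files, no Z.Z
]
training_y = ['.Y, no .X, no Z.Z', '.X + Z.Z', '.X w/o Z.Z']

clf = MLPClassifier(solver='lbfgs', alpha=1e-5,
                    hidden_layer_sizes=(4, 4), random_state=1)
clf.fit(training_x, training_y)

# classes_ holds the labels in the order predict_proba's columns use
# (sorted, not the order you listed them in training_y)
print(clf.classes_)

# predict picks the most probable class and maps it back to its label
for i in (0, 1):
    for j in (0, 1):
        for k in (0, 1):
            print(f'{i},{j},{k} ==> {clf.predict([[i, j, k]])[0]}')
```

Note that list(clf.classes_) equals sorted(training_y), which is why indexing the raw probabilities by your original training_y order looked "reversed".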