multi-class案例的混淆矩阵，所有评估指标的估计

Question

我正在做一些自我学习 ML.I 我正在尝试训练 K-NN 模型，我的模型给了我低于 7X7 的混淆 matrix.I 已经手动计算了两个 class(A 和 B) 但不确定我的方法是否正确，或者 not.I 如果确认我的结果，是否希望其他人在那里，以便我将编写一个程序来计算其他 5 classes.

Confusion Matrx : 

  A      B    C    D   E    F     G =  7 classes
[[ 238    1    2    0   41   11    0]
 [   0   25    0    0    3    1    0]
 [  21    1   32    0   17    4    0]
 [   0    0    0    7    9    3    0]
 [   7    0    0    0 3633    8    0]
 [  44    0    4    1  397  256    1]
 [   4    0    0    0    7    2    3]]


Class-A 
tp = 238
fp = 76  (Rest of the 'a' column e.g 21+0+7+44+4 = 76)
tn = 4414


    [25    0    0    3    1    0]
    [1    32    0   17    4    0]
    [0     0    7    9    3    0]
    [0     0    0 3633    8    0]
    [0     4    1  397  256    1]
    [0     0    0    7    2    3]

  Sum of all elements will be true negative = 4414

  fn = 55 (Rest of the 'a' rows element e.g 1+2+0+41+11+0 =55 )

  So class-A  confusion matrix will something look like below

  tp fp | 238 76
        |
  fn tn | 55 4414

  Class A Accuracy = tp+tn/tp+tn+fp+fn = 4652/4768 = 0.97

  Class B 
  tp = 25
  fp = 2
  tn = 4752

  [  238       2    0   41   11    0]
  [  21       32    0   17    4    0]
  [   0        0    7    9    3    0]
  [   7        0    0 3633    8    0]
  [  44        4    1  397  256    1]
  [   4        0    0    7    2    3]

  fn = 4

  So class-B  confusion matrix will something look like below

  tp fp | 2 76
        |
  fn tn | 4 4752

 Class B Accuracy = tp+tn/tp+tn+fp+fn = 4754/4834 = 0.98

我尝试在线搜索但没有找到任何超过 2X2 矩阵的计算。

Answer 1

对于多 class 情况，可以使用：

import numpy as np

cnf_matrix = np.array([[13,  0,  0],
                       [ 0, 10,  6],
                       [ 0,  0,  9]])

FP = cnf_matrix.sum(axis=0) - np.diag(cnf_matrix)  
FN = cnf_matrix.sum(axis=1) - np.diag(cnf_matrix)
TP = np.diag(cnf_matrix)
TN = cnf_matrix.sum() - (FP + FN + TP)

FP = FP.astype(float)
FN = FN.astype(float)
TP = TP.astype(float)
TN = TN.astype(float)


# Sensitivity, hit rate, recall, or true positive rate
TPR = TP/(TP+FN)
# Specificity or true negative rate
TNR = TN/(TN+FP) 
# Precision or positive predictive value
PPV = TP/(TP+FP)
# Negative predictive value
NPV = TN/(TN+FN)
# Fall out or false positive rate
FPR = FP/(FP+TN)
# False negative rate
FNR = FN/(TP+FN)
# False discovery rate
FDR = FP/(TP+FP)

# Overall accuracy
ACC = (TP+TN)/(TP+FP+FN+TN)

代码背后的想法：在下图中有很多 classes 的一般情况下，这些指标以图形方式表示。

multi-class案例的混淆矩阵，所有评估指标的估计

Confusion matrix for multi-class case, estimation of all evaluation metrics

machine-learning

confusion-matrix

scikit-learn