Arrays TP, TN, FP and FN in Python

My prediction results look like this:

Test array

[1,0,0,0,1,0,1,...,1,0,1,1],
[1,0,1,0,0,1,0,...,0,1,1,1],
[0,1,1,1,1,1,0,...,0,1,1,1],
.
.
.
[1,1,0,1,1,0,1,...,0,1,1,1],

Prediction array

[1,0,0,0,0,1,1,...,1,0,1,1],
[1,0,1,1,1,1,0,...,1,0,0,1],
[0,1,0,1,0,0,0,...,1,1,1,1],
.
.
.
[1,1,0,1,1,0,1,...,0,1,1,1],

These are the shapes of my arrays:

TestArray.shape

Out[159]: (200, 24)

PredictionArray.shape

Out[159]: (200, 24)

I want to get the TP, TN, FP and FN for these arrays.

I tried this code:

cm=confusion_matrix(TestArray.argmax(axis=1), PredictionArray.argmax(axis=1))
TN = cm[0][0]
FN = cm[1][0]
TP = cm[1][1]
FP = cm[0][1]
print(TN,FN,TP,FP)

But the result I got was:

125 5 0 1

I checked the shape of cm:

cm.shape

Out[168]: (17, 17)

125 + 5 + 0 + 1 = 131, which does not equal 200, the number of rows I have.

I expected 200, because every cell in the array should be a TP, TN, FP or FN, so the total should be 200.

How can I fix this?

Here is an example of the problem:

import numpy as np
from sklearn.metrics import confusion_matrix


TestArray = np.array(
[
[1,0,0,1,0,1,1,0,1,0,1,1,0,0,1,1,1,0,0,1],
[0,1,1,0,1,0,0,1,0,0,0,1,0,1,0,1,1,0,1,1],
[1,0,1,1,1,1,0,0,1,1,1,1,0,0,1,0,0,0,0,0],
[0,1,1,1,0,0,0,0,0,1,0,0,1,0,0,1,0,1,1,1],
[0,0,0,0,1,1,0,1,1,0,0,1,0,1,1,0,1,1,1,1],
[1,0,0,1,1,1,0,1,1,0,1,0,0,1,1,0,0,1,0,0],
[1,1,1,0,0,1,0,0,1,1,0,1,0,1,1,1,1,1,0,1],
[0,0,0,1,0,0,1,0,1,0,1,0,0,0,0,1,0,0,1,1],
[1,0,1,0,0,0,0,1,0,1,0,1,0,0,0,0,1,0,1,0],
[1,1,0,1,1,1,1,0,1,0,1,0,1,1,1,1,0,1,0,0]
])

TestArray.shape



PredictionArray = np.array(
[
[0,0,0,1,1,1,1,0,0,0,1,0,0,0,1,0,1,0,1,1],
[0,1,0,0,1,0,1,1,0,0,0,1,1,0,0,1,1,0,0,1],
[1,1,0,1,1,1,0,0,0,0,0,1,0,0,1,0,0,1,0,0],
[0,1,0,1,0,0,1,0,0,1,0,1,1,0,0,1,0,0,1,1],
[0,0,1,0,0,1,0,1,1,1,0,1,1,1,0,0,1,1,0,1],
[1,0,0,1,0,1,1,1,1,0,0,1,0,1,1,1,0,1,1,0],
[1,1,0,0,1,1,0,0,0,1,0,1,0,0,1,1,0,1,0,1],
[0,0,0,0,0,0,0,1,1,0,1,0,0,1,0,1,1,0,1,1],
[1,0,1,1,0,0,0,1,0,1,0,1,1,1,1,0,0,0,1,0],
[1,1,0,1,1,1,1,1,1,0,1,0,0,0,0,1,1,1,0,0]
])

PredictionArray.shape

cm=confusion_matrix(TestArray.argmax(axis=1), PredictionArray.argmax(axis=1))
TN = cm[0][0]
FN = cm[1][0]
TP = cm[1][1]
FP = cm[0][1]

print(TN,FN,TP,FP)

The output is

5 0 2 0 

= 5+0+2+0 = 7 !!

The arrays have 20 columns and 10 rows,

but cm sums to only 7!!

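When you use np.argmax, the inputs you pass to sklearn.metrics.confusion_matrix are no longer binary, because np.argmax returns the index of the first occurrence of the maximum value, in this case along axis=1. Each row is therefore reduced to a single column index, and confusion_matrix treats those indices as multi-class labels.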
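To make this concrete, here is a minimal sketch that applies argmax to three rows copied from the question's example TestArray. The variable names rows and labels are only for illustration.

```python
import numpy as np

# Rows 1, 2 and 5 of the example TestArray from the question.
rows = np.array([
    [1,0,0,1,0,1,1,0,1,0,1,1,0,0,1,1,1,0,0,1],
    [0,1,1,0,1,0,0,1,0,0,0,1,0,1,0,1,1,0,1,1],
    [0,0,0,0,1,1,0,1,1,0,0,1,0,1,1,0,1,1,1,1],
])

# argmax(axis=1) returns, for each row, the column index of its first 1
# (the first occurrence of the row maximum), not a 0/1 value.
labels = rows.argmax(axis=1)

print(labels)        # [0 1 4]
print(labels.shape)  # (3,)
```

The 20 binary values in each row collapse into one integer, so the labels can take any value from 0 to 19 rather than just 0 or 1.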

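When your predictions are not binary, the matrix is not 2x2, so the four cells cm[0][0], cm[0][1], cm[1][0] and cm[1][1] no longer correspond to clean true positives/hits, true negatives/correct rejections, and so on.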

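You should find that sum(sum(cm)) does equal 200 for your (200, 24) arrays, since after argmax there is exactly one label per row.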
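The same check can be run on the 10-row example from the question, where the total is 10 rather than 200. In the sketch below the two label vectors are the per-row argmax values worked out by hand from the example arrays; the names true_labels and pred_labels are only for illustration.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Per-row argmax of the example TestArray and PredictionArray
# (the column index of the first 1 in each of the 10 rows).
true_labels = np.array([0, 1, 0, 1, 4, 0, 0, 3, 0, 0])
pred_labels = np.array([3, 1, 0, 1, 2, 0, 0, 7, 0, 0])

cm = confusion_matrix(true_labels, pred_labels)

print(cm.shape)          # (6, 6): one row/column per distinct label
print(cm.sum())          # 10: one count per row of the original arrays
print(cm[:2, :2].sum())  # 7: the four cells that were read as TN, FN, TP, FP
```

The top-left 2x2 corner holds only the rows whose labels are 0 or 1 on both sides, which is why the four values in the question add up to 7 and not to the number of rows.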


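If each element of the arrays represents a separate prediction, i.e. you are trying to get TP/TN/FP/FN over a total of 200 (10 * 20) predictions, each with an outcome of 0 or 1, then you can get TP/TN/FP/FN by flattening the arrays before passing them to confusion_matrix. That is, you can reshape TestArray and PredictionArray to (200,), for example: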

cm = confusion_matrix(TestArray.reshape(-1), PredictionArray.reshape(-1))

TN = cm[0][0]
FN = cm[1][0]
TP = cm[1][1]
FP = cm[0][1]

print(TN, FN, TP, FP, '=', TN + FN + TP + FP)

which returns

74 28 73 25 = 200
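As a cross-check that does not depend on how the confusion matrix is laid out, the four counts can also be computed directly with NumPy boolean masks. This is a minimal sketch, assuming both arrays have the same shape and contain only 0 and 1; the helper name binary_counts and the tiny input arrays are made up for illustration.

```python
import numpy as np

def binary_counts(y_true, y_pred):
    """Return (TN, FN, TP, FP) counted over every cell of two 0/1 arrays."""
    # Flatten so that every cell is treated as one independent prediction.
    t = np.asarray(y_true).ravel().astype(bool)
    p = np.asarray(y_pred).ravel().astype(bool)
    tp = int(np.sum(t & p))    # actual 1, predicted 1
    tn = int(np.sum(~t & ~p))  # actual 0, predicted 0
    fp = int(np.sum(~t & p))   # actual 0, predicted 1
    fn = int(np.sum(t & ~p))   # actual 1, predicted 0
    return tn, fn, tp, fp

# Tiny illustrative inputs (not the arrays from the question).
y_true = np.array([[1, 0, 1], [0, 0, 1]])
y_pred = np.array([[1, 1, 0], [0, 0, 1]])

tn, fn, tp, fp = binary_counts(y_true, y_pred)
print(tn, fn, tp, fp, '=', tn + fn + tp + fp)  # 2 1 2 1 = 6
```

Because every cell falls into exactly one of the four masks, the total always equals the number of cells, which is the property the question was expecting.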