Confusion Matrix - TP when both true/predicted columns have a value

Question

I need to create a confusion matrix defined as follows:

Truth - any value & Predicted - any value : True Positive
Truth - NaN       & Predicted - NaN       : True Negative
Truth - any value & Predicted - NaN       : False Negative
Truth - NaN       & Predicted - any value : False Positive

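This differs from a typical confusion matrix calculation because I have no labels to compare. Is there a simple way to do this in Python (even if it means computing the TP/TN/FP/FN values manually)?

Thanks in advance!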

Answer 1

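I think this becomes straightforward if you define the labels as:

True: "my number is not NaN"
False: "my number is NaN"

Your setup then reduces to a simple binary classification task, and you can use a standard confusion matrix without any problem.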

import numpy as np 
from sklearn.metrics import confusion_matrix

pred  = [np.nan,    234,  1, 0, np.nan, -23,    3.2]
truth = [np.nan, np.nan, 21, 1,      0,  21, np.nan]

# Convert your predictions and true values into labels
pred_labels = ~np.isnan(pred)  
truth_labels = ~np.isnan(truth)

print(pred_labels)
# [False  True  True  True False  True  True]

print(truth_labels)
# [False False  True  True  True  True False]

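# scikit-learn expects the ground truth first: confusion_matrix(y_true, y_pred)
print(confusion_matrix(truth_labels, pred_labels))
# [[1 2]
#  [1 3]]

If you want to inspect TP, TN, FP and FN individually:

# For binary labels, ravel() returns the counts in the order tn, fp, fn, tp
(tn, fp, fn, tp) = confusion_matrix(truth_labels, pred_labels).ravel()
print(tp)  # 3
print(tn)  # 1
print(fp)  # 2
print(fn)  # 1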
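
Since the question also allows computing the values manually, here is a minimal sketch that counts the four cells with plain NumPy boolean masks, following the definitions given in the question. The names has_pred and has_truth are just illustrative, and it assumes both columns are numeric so that np.isnan applies.

```python
import numpy as np

pred = np.array([np.nan, 234, 1, 0, np.nan, -23, 3.2])
truth = np.array([np.nan, np.nan, 21, 1, 0, 21, np.nan])

# True where a value is present, False where it is NaN
has_pred = ~np.isnan(pred)
has_truth = ~np.isnan(truth)

tp = int(np.sum(has_truth & has_pred))    # truth has a value, prediction has a value
tn = int(np.sum(~has_truth & ~has_pred))  # truth is NaN, prediction is NaN
fn = int(np.sum(has_truth & ~has_pred))   # truth has a value, prediction is NaN
fp = int(np.sum(~has_truth & has_pred))   # truth is NaN, prediction has a value

print(tp, tn, fp, fn)  # 3 1 2 1
```

This can serve as a quick cross-check of whatever confusion_matrix returns, since it does not depend on argument order.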

Note: if you also want to treat infinite values (np.inf) as missing, you can use np.isfinite(x) instead of ~np.isnan(x).
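
To illustrate the difference between the two checks, here is a small sketch on a made-up array (the values are purely for demonstration and not from the question):

```python
import numpy as np

# Hypothetical sample containing NaN and both infinities
values = np.array([1.5, np.nan, np.inf, -np.inf, 0.0])

not_nan = ~np.isnan(values)   # infinities still count as "has a value"
finite = np.isfinite(values)  # NaN and +/-inf all count as missing

print(not_nan.tolist())  # [True, False, True, True, True]
print(finite.tolist())   # [True, False, False, False, True]
```

Which one is appropriate depends on whether an infinite value should count as a real prediction in your data.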
