Get most common match for each unique item in two arrays
I have data similar to these two arrays:
predicted_class = ['A','B','C','A','B','A','B','C','A']
true_class_____ = ['A','B','C','A','B','C','A','B','C']
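I want to find the number of classes that are predicted correctly after taking the majority consensus. For example, my data shows that 'A' is predicted correctly 66% of the time, 'B' 66%, and 'C' 33%, so the overall accuracy is 66%, because the most common prediction is correct for classes 'A' and 'B' but not for 'C'.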
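Based on what you wrote in your example and comments, it looks like you want the maximum, over all classes, of the ratio of correct predictions to all occurrences of that class.
Here is one way to do it with collections.Counter: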
import collections

def max_model_match(true, predicted):
    # count all occurrences of the classes
    counter_all = collections.Counter(true)
    # initialize the "correct" or "good" counter
    counter_good = counter_all.copy()
    counter_good.clear()
    # loop through all outcomes
    for (x, y) in zip(true, predicted):
        # if the prediction is correct increment the counter
        if x == y:
            counter_good[x] += 1
    # find the maximum correct-to-all ratio
    max_good_ratio = 0.0
    for key in counter_all.keys():
        good_ratio = counter_good[key] / counter_all[key]
        if good_ratio > max_good_ratio:
            max_good_ratio = good_ratio
    return max_good_ratio

predicted_class = ['A','B','C','A','B','A','B','C','A']
true_class = ['A','B','C','A','B','C','A','B','C']
max_model_match(true_class, predicted_class)
# 0.6666666666666666
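To see where that maximum comes from, it can help to list the ratio for every class rather than only the largest one. This is a minimal sketch, not part of the answer's own code; `per_class_accuracy` is a hypothetical helper name, and it assumes both lists have the same length.

```python
from collections import Counter

def per_class_accuracy(true, predicted):
    """Return {class: correct / total} for every class that occurs in `true`."""
    # how often each class appears as the true label
    totals = Counter(true)
    # how often each class was predicted correctly
    correct = Counter(t for t, p in zip(true, predicted) if t == p)
    # a Counter returns 0 for missing keys, so never-correct classes get 0.0
    return {cls: correct[cls] / totals[cls] for cls in totals}

predicted_class = ['A','B','C','A','B','A','B','C','A']
true_class = ['A','B','C','A','B','C','A','B','C']

ratios = per_class_accuracy(true_class, predicted_class)
print(ratios)
# {'A': 0.6666666666666666, 'B': 0.6666666666666666, 'C': 0.3333333333333333}
print(max(ratios.values()))
# 0.6666666666666666
```

These match the 66% / 66% / 33% figures from the question.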
A simple approach using defaultdict and max:
from collections import defaultdict

predicted_class = ['A','B','C','A','B','A','B','C','A']
true_class = ['A','B','C','A','B','C','A','B','C']

d = defaultdict(lambda: [0, 0])  # true class -> [total, correct]
for p, t in zip(predicted_class, true_class):
    d[t][0] += 1
    if p == t:
        d[t][1] += 1

# max value of correct / total over all classes
max(correct / total for total, correct in d.values())
Output: 0.6666666666666666
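The question's wording ("majority consensus") can also be read literally: for each true class, take the most common prediction, and count the class as correct only if that prediction equals the class. The sketch below follows that reading; it is an alternative interpretation, not what the answers above compute. `majority_vote_accuracy` is a hypothetical name, the input is assumed to be non-empty, and ties are resolved by `Counter.most_common`, which keeps the prediction seen first.

```python
from collections import Counter, defaultdict

def majority_vote_accuracy(true, predicted):
    """Fraction of true classes whose most common prediction is the class itself."""
    # true class -> Counter of the predictions made for it
    votes = defaultdict(Counter)
    for t, p in zip(true, predicted):
        votes[t][p] += 1
    # a class counts as a hit when its top prediction equals the class
    hits = sum(
        1 for cls, counts in votes.items()
        if counts.most_common(1)[0][0] == cls
    )
    return hits / len(votes)

predicted_class = ['A','B','C','A','B','A','B','C','A']
true_class = ['A','B','C','A','B','C','A','B','C']

print(majority_vote_accuracy(true_class, predicted_class))
# 0.6666666666666666
```

On this data both readings give 0.666..., because two of the three classes ('A' and 'B') win their majority vote; on other data the two numbers can differ.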