Retrieve the ranking of elements in various lists to compute the weighted average of their ranking scores (Python)
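I have two sorted dictionaries, i.e. they are now represented as lists.
I want to retrieve the rank position of each element in each list and store it in a variable, so that I can eventually compute the weighted average of each element's ranking scores across the two lists. Here is an example.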
dict1 = {'class1': 15.17, 'class2': 15.95, 'class3': 15.95}
sorted_dict1 = [('class1', 15.17), ('class2', 15.95), ('class3', 15.95)]
sorted_dict2 = [('class2', 9.10), ('class3', 9.22), ('class1', 10.60)]
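So far I can retrieve the rank position of each element in a list and print it, but when I try to compute the weighted average of the ranking scores, i.e. (w1*a + w2*b)/(w1 + w2), where "a" is the rank position in sorted_dict1 and "b" is the rank position in sorted_dict2, the numbers I get are not the correct weighted averages.
I have tried various things; here is one: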
for idx, val in list(enumerate(sorted_dict1, 1)):
    for idx1, val1 in list(enumerate(sorted_dict2, 1)):
        position_dict1 = idx
        position_dict2 = idx1
        weighted_average = float((0.50*position_dict1 + 0.25*position_dict2))/0.75
        print(weighted_average)
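I also haven't considered what happens if two classes rank the same in a list. I'd be glad of any hints or help on that too.
I think I may need to write a function to solve this, but I haven't got far with that either.
Any help, along with comments explaining the code, would be great.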
So I want to compute the weighted average of the rank positions of the elements in the lists. For example:
class1: weighted_average = ((0.50 * 1) + (0.25 * 3))/0.75 = 1.6666..7
class2: weighted_average = ((0.50 * 2) + (0.25 * 1))/0.75 = 1.6666..7
Thanks!
I took the simple approach and gave tied classes the next integer rank, so class3 and class2 both rank 2 in sorted_dict1.
#!/usr/bin/env python

#Get the ranks for a list of (class, score) tuples sorted by score
#and return them in a dict
def get_ranks(sd):
    #The first class in the list has rank 1
    k, val = sd[0]
    r = 1
    rank = {k: r}

    for k, v in sd[1:]:
        #Only update the rank number if this value is
        #greater than the previous
        if v > val:
            val = v
            r += 1
        rank[k] = r
    return rank

def weighted_mean(a, b):
    return (0.50*a + 0.25*b) / 0.75

sorted_dict1 = [('class1', 15.17), ('class2', 15.95), ('class3', 15.95)]
sorted_dict2 = [('class2', 9.10), ('class3', 9.22), ('class1', 10.60)]

print(sorted_dict1)
print(sorted_dict2)

ranks1 = get_ranks(sorted_dict1)
ranks2 = get_ranks(sorted_dict2)

print(ranks1)
print(ranks2)

keys = sorted(k for k,v in sorted_dict1)

print([(k, weighted_mean(ranks1[k], ranks2[k])) for k in keys])
Output
[('class1', 15.17), ('class2', 15.949999999999999), ('class3', 15.949999999999999)]
[('class2', 9.0999999999999996), ('class3', 9.2200000000000006), ('class1', 10.6)]
{'class2': 2, 'class3': 2, 'class1': 1}
{'class2': 1, 'class3': 2, 'class1': 3}
[('class1', 1.6666666666666667), ('class2', 1.6666666666666667), ('class3', 2.0)]
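For comparison, the nested loops in the question pair every position in sorted_dict1 with every position in sorted_dict2, so they print one value per pair of positions, and most of those values mix the ranks of two different classes. Below is a minimal sketch of the difference. It uses plain enumerate() positions as ranks, so ties are ignored, and positions() is a hypothetical helper name, not part of the programs above.

```python
# Sketch only: list positions stand in for ranks, so ties are NOT handled.
sorted_dict1 = [('class1', 15.17), ('class2', 15.95), ('class3', 15.95)]
sorted_dict2 = [('class2', 9.10), ('class3', 9.22), ('class1', 10.60)]

def positions(sd):
    """Map each class name to its 1-based position in the sorted list."""
    return {name: pos for pos, (name, _score) in enumerate(sd, 1)}

# The nested loops visit every (position, position) pair: 3 * 3 of them.
nested_pairs = [(i, j)
                for i, _ in enumerate(sorted_dict1, 1)
                for j, _ in enumerate(sorted_dict2, 1)]

# Looking positions up by class name gives exactly one pair per class.
pos1 = positions(sorted_dict1)
pos2 = positions(sorted_dict2)
paired = {name: (pos1[name], pos2[name]) for name in pos1}

print(len(nested_pairs))
print(paired['class1'])
```

Keying the ranks by class name, as get_ranks() does, is what lets each class's two ranks be matched up before the weighted mean is taken.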
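I mentioned in the comments that there's a nice way to create a weighted_mean() function with custom weights. Of course, we could just pass the weights as additional arguments to weighted_mean(), but that makes the calls to weighted_mean() more cluttered than they need to be, which makes the program harder to read.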
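The trick is to use a function that takes the custom weights as arguments and returns the desired function. Technically, the function returned this way is called a closure, because it remembers the weights from the call that created it.
Here's a short demo of how to do that.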
#!/usr/bin/env python

#Create a weighted mean function with weights w1 & w2
def make_weighted_mean(w1, w2):
    wt = float(w1 + w2)
    def wm(a, b):
        return (w1 * a + w2 * b) / wt
    return wm

#Make the weighted mean function
weighted_mean = make_weighted_mean(1, 2)

#Test
print(weighted_mean(6, 3))
print(weighted_mean(3, 9))
Output
4.0
7.0
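As an aside, a similar effect can be sketched with functools.partial from the standard library, which freezes the leading arguments of an existing function. This is a different technique from the closure above, shown only for comparison; weighted_mean_with_weights is a hypothetical name introduced for this example.

```python
from functools import partial

def weighted_mean_with_weights(w1, w2, a, b):
    """Weighted mean of a and b with explicit weights w1 and w2."""
    return (w1 * a + w2 * b) / float(w1 + w2)

# Freeze the two weights; the result takes only a and b,
# like the function returned by make_weighted_mean(1, 2).
weighted_mean = partial(weighted_mean_with_weights, 1, 2)

print(weighted_mean(6, 3))
print(weighted_mean(3, 9))
```

One difference worth noting: this version recomputes w1 + w2 on every call, whereas the closure computes the total weight once when it is created.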
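Here's an updated version of the first program above that can handle any number of sorted_dict lists. It uses the original get_ranks() function, but it uses a slightly more complex closure than the example above to apply the weights to a list (or tuple) of data.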
#!/usr/bin/env python

''' Weighted means of ranks

    From

    Written by PM 2Ring 2015.04.03
'''

from pprint import pprint

#Create a weighted mean function with weights from list/tuple weights
def make_weighted_mean(weights):
    wt = float(sum(weights))
    #A function that calculates the weighted mean of values in seq
    #weighted by the weights passed to make_weighted_mean()
    def wm(seq):
        return sum(w * v for w, v in zip(weights, seq)) / wt
    return wm

#Get the ranks for a list of (class, score) tuples sorted by score
#and return them in a dict
def get_ranks(sd):
    #The first class in the list has rank 1
    k, val = sd[0]
    r = 1
    rank = {k: r}

    for k, v in sd[1:]:
        #Only update the rank number if this value is
        #greater than the previous
        if v > val:
            val = v
            r += 1
        rank[k] = r
    return rank

#Make the weighted mean function
weights = [0.50, 0.25]
weighted_mean = make_weighted_mean(weights)

#Some test data
sorted_dicts = [
    [('class1', 15.17), ('class2', 15.95), ('class3', 15.95), ('class4', 16.0)],
    [('class2', 9.10), ('class3', 9.22), ('class1', 10.60), ('class4', 11.0)]
]
print('Sorted dicts:')
pprint(sorted_dicts, indent=4)

all_ranks = [get_ranks(sd) for sd in sorted_dicts]
print('\nAll ranks:')
pprint(all_ranks, indent=4)

#Get a sorted list of the keys
keys = sorted(k for k,v in sorted_dicts[0])
#print '\nKeys:', keys

means = [(k, weighted_mean([ranks[k] for ranks in all_ranks])) for k in keys]
print('\nWeighted means:')
pprint(means, indent=4)
Output
Sorted dicts:
[   [   ('class1', 15.17),
        ('class2', 15.949999999999999),
        ('class3', 15.949999999999999),
        ('class4', 16.0)],
    [   ('class2', 9.0999999999999996),
        ('class3', 9.2200000000000006),
        ('class1', 10.6),
        ('class4', 11.0)]]

All ranks:
[   {   'class1': 1, 'class2': 2, 'class3': 2, 'class4': 3},
    {   'class1': 3, 'class2': 1, 'class3': 2, 'class4': 4}]

Weighted means:
[   ('class1', 1.6666666666666667),
    ('class2', 1.6666666666666667),
    ('class3', 2.0),
    ('class4', 3.3333333333333335)]
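Note that the program takes its keys from the first list only, so it assumes every class appears in every list; a class missing from a later list would raise a KeyError in the ranks[k] lookup. One possible guard, sketched below, is to keep only the classes that are ranked in all lists. common_keys() is a hypothetical helper, and the rank dicts here are made-up example data rather than output from the program.

```python
def common_keys(all_ranks):
    """Return the sorted class names that appear in every rank dict."""
    if not all_ranks:
        return []
    shared = set(all_ranks[0])
    for ranks in all_ranks[1:]:
        shared &= set(ranks)
    return sorted(shared)

# Hypothetical rank dicts: 'class4' is missing from the second list.
all_ranks = [
    {'class1': 1, 'class2': 2, 'class3': 2, 'class4': 3},
    {'class1': 3, 'class2': 1, 'class3': 2},
]

keys = common_keys(all_ranks)
print(keys)
```

Whether to drop such classes, or to give them some default rank instead, depends on what the weighted mean is meant to represent, so treat this as one option rather than the right answer.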
Here's an alternative version of get_ranks() that skips rank numbers if two or more classes rank the same in a list.
def get_ranks(sd):
    #The first class in the list has rank 1
    k, val = sd[0]
    r = 1
    rank = {k: r}
    #The step size from one rank to the next. Normally
    #delta is 1, but it's increased if there are ties.
    delta = 1

    for k, v in sd[1:]:
        #Update the rank number if this value is
        #greater than the previous.
        if v > val:
            val = v
            r += delta
            delta = 1
        #Otherwise, update delta
        else:
            delta += 1
        rank[k] = r
    return rank
Here's the output of the program using the alternative version of get_ranks():
Sorted dicts:
[   [   ('class1', 15.17),
        ('class2', 15.949999999999999),
        ('class3', 15.949999999999999),
        ('class4', 16.0)],
    [   ('class2', 9.0999999999999996),
        ('class3', 9.2200000000000006),
        ('class1', 10.6),
        ('class4', 11.0)]]

All ranks:
[   {   'class1': 1, 'class2': 2, 'class3': 2, 'class4': 4},
    {   'class1': 3, 'class2': 1, 'class3': 2, 'class4': 4}]

Weighted means:
[   ('class1', 1.6666666666666667),
    ('class2', 1.6666666666666667),
    ('class3', 2.0),
    ('class4', 4.0)]
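The two get_ranks() versions correspond to what are often called dense ranking (1, 2, 2, 3) and standard competition ranking (1, 2, 2, 4). As a rough sketch, both can be written with itertools.groupby. The name get_ranks_grouped and its dense parameter are made up for illustration, and the input is assumed to be sorted by score in ascending order already, as in the programs above.

```python
from itertools import groupby
from operator import itemgetter

def get_ranks_grouped(sd, dense=True):
    """Rank (class, score) tuples that are already sorted by score.

    dense=True:  tied classes share a rank, the next rank follows on
    dense=False: tied classes share a rank, the next rank skips ahead
    """
    rank = {}
    r = 1
    # groupby() collects runs of consecutive items with equal scores
    for _score, group in groupby(sd, key=itemgetter(1)):
        names = [name for name, _ in group]
        for name in names:
            rank[name] = r
        # Step by 1 for dense ranks, or by the size of the tie otherwise
        r += 1 if dense else len(names)
    return rank

data = [('class1', 15.17), ('class2', 15.95), ('class3', 15.95), ('class4', 16.0)]
print(get_ranks_grouped(data))
print(get_ranks_grouped(data, dense=False))
```

Unlike the versions above, this sketch returns an empty dict for an empty list instead of raising an IndexError on sd[0].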