Enumerate a list sorted by two fields in Python
I have an array like this:
Field 4 is the mean of fields 1, 2, 3 and field 5 is the minimum of fields 1, 2, 3.
[['name0', 24, 19, 25, 22.67, 19],
['name1', 25, 19, 25, 23.0, 19],
['name2', 25, 19, 25, 23.0, 19],
['name3', 24, 22, 23, 23.0, 22],
['name4', 27, 19, 25, 23.67, 19],
['name5', 27, 19, 25, 23.67, 19],
['name6', 28, 19, 26, 24.33, 19],
['name7', 28, 19, 26, 24.33, 19],
['name8', 28, 19, 26, 24.33, 19],
['name9', 26, 22, 27, 25.0, 22],
['name10', 27, 23, 25, 25.0, 23],
['name11', 30, 19, 27, 25.33, 19],
['name12', 24, 31, 28, 27.67, 24],
['name13', 28, 27, 28, 27.67, 27],
['name14', 27, 29, 27, 27.67, 27],
['name15', 29, 26, 29, 28.0, 26],
['name16', 29, 26, 30, 28.33, 26],
['name17', 30, 31, 26, 29.0, 26],
['name18', 33, 27, 30, 30.0, 27],
['name19', 29, 31, 30, 30.0, 29],
['name20', 30, 36, 31, 32.33, 30],
['name21', 36, 30, 32, 32.67, 30],
['name22', 38, 33, 36, 35.67, 33],
['name23', 30, 27, 99, 52.0, 27],
['name24', 99, 27, 32, 52.67, 27],
['name25', 37, 99, 36, 57.33, 36]]
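It is already sorted by field 4, then by field 5.
I would like to enumerate this list, creating a kind of "ranking" or "podium".
enumerate() does not work because, as you can see, some rows are tied on fields 4 and 5, so their "rank" should be the same.
For example, the first values should look like this: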
[['1', 'name0', 24, 19, 25, 22.67, 19],
['2', 'name1', 25, 19, 25, 23.0, 19],
['2', 'name2', 25, 19, 25, 23.0, 19],
['3', 'name3', 24, 22, 23, 23.0, 22],
['4', 'name4', 27, 19, 25, 23.67, 19],
...]
I could not find a clean way to do this.
Thanks for your help.
Start with i = 1, iterate over the rows and assign the rank, incrementing with i += 1 only when the next row differs.
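A minimal sketch of that loop, assuming the rows are already sorted and that the mean and minimum sit at indices 4 and 5. The function name `add_dense_rank` and the four-row `rows` sample (taken from the question) are only illustrative:

```python
def add_dense_rank(rows, key_indices=(4, 5)):
    """Return new rows with a string rank prepended; tied rows share a rank."""
    ranked = []
    rank = 0
    previous_key = object()  # sentinel that never equals a real key
    for row in rows:
        key = tuple(row[i] for i in key_indices)
        if key != previous_key:
            # Only move to the next rank when the key fields change.
            rank += 1
            previous_key = key
        ranked.append([str(rank)] + row)
    return ranked


rows = [
    ['name0', 24, 19, 25, 22.67, 19],
    ['name1', 25, 19, 25, 23.0, 19],
    ['name2', 25, 19, 25, 23.0, 19],
    ['name3', 24, 22, 23, 23.0, 22],
]
ranked = add_dense_rank(rows)
print([r[0] for r in ranked])  # ['1', '2', '2', '3']
```

The sketch builds new lists rather than mutating the input, so the original rows stay usable.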
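Assuming the list is already sorted, you can use the aptly named groupby, together with itemgetter, to group the sublists by their elements at indices 4 and 5 (the mean and the minimum). Then use enumerate on the iterator returned by groupby: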
from itertools import groupby
from operator import itemgetter
# data = [['name0', ...
[ [str(i+1)] + l for i, (k, g) in enumerate(groupby(data, key=itemgetter(4, 5))) for l in g ]
Output:
[
['1', 'name0', 24, 19, 25, 22.67, 19],
['2', 'name1', 25, 19, 25, 23.0, 19],
['2', 'name2', 25, 19, 25, 23.0, 19],
['3', 'name3', 24, 22, 23, 23.0, 22],
['4', 'name4', 27, 19, 25, 23.67, 19],
['4', 'name5', 27, 19, 25, 23.67, 19],
['5', 'name6', 28, 19, 26, 24.33, 19],
['5', 'name7', 28, 19, 26, 24.33, 19],
['5', 'name8', 28, 19, 26, 24.33, 19],
['6', 'name9', 26, 22, 27, 25.0, 22],
['7', 'name10', 27, 23, 25, 25.0, 23],
['8', 'name11', 30, 19, 27, 25.33, 19],
['9', 'name12', 24, 31, 28, 27.67, 24],
['10', 'name13', 28, 27, 28, 27.67, 27],
['10', 'name14', 27, 29, 27, 27.67, 27],
['11', 'name15', 29, 26, 29, 28.0, 26],
['12', 'name16', 29, 26, 30, 28.33, 26],
['13', 'name17', 30, 31, 26, 29.0, 26],
['14', 'name18', 33, 27, 30, 30.0, 27],
['15', 'name19', 29, 31, 30, 30.0, 29],
['16', 'name20', 30, 36, 31, 32.33, 30],
['17', 'name21', 36, 30, 32, 32.67, 30],
['18', 'name22', 38, 33, 36, 35.67, 33],
['19', 'name23', 30, 27, 99, 52.0, 27],
['20', 'name24', 99, 27, 32, 52.67, 27],
['21', 'name25', 37, 99, 36, 57.33, 36]
]
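groupby only merges rows with equal keys when they are adjacent, which is why the answer assumes sorted input. If the rows might arrive in another order, one option is to sort them with the same key first. The sketch below assumes a small shuffled sample built from rows in the question; `unsorted_rows`, `ordered` and `ranked_rows` are illustrative names:

```python
from itertools import groupby
from operator import itemgetter

key = itemgetter(4, 5)  # (mean, min)

# Hypothetical shuffled sample, using rows from the question.
unsorted_rows = [
    ['name3', 24, 22, 23, 23.0, 22],
    ['name1', 25, 19, 25, 23.0, 19],
    ['name0', 24, 19, 25, 22.67, 19],
    ['name2', 25, 19, 25, 23.0, 19],
]

# Sort first so that rows with equal keys become adjacent.
ordered = sorted(unsorted_rows, key=key)

ranked_rows = [
    [str(rank)] + row
    for rank, (_, group) in enumerate(groupby(ordered, key=key), start=1)
    for row in group
]
print([r[:2] for r in ranked_rows])
# [['1', 'name0'], ['2', 'name1'], ['2', 'name2'], ['3', 'name3']]
```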
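Using Pandas and dense rank (note that this ranks on column '5', the mean, alone, so rows that tie on the mean but differ on the minimum, such as name3, share a rank):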
import pandas as pd
df = pd.DataFrame(data = [['name0', 24, 19, 25, 22.67, 19],
['name1', 25, 19, 25, 23.0, 19],
['name2', 25, 19, 25, 23.0, 19],
['name3', 24, 22, 23, 23.0, 22],
['name4', 27, 19, 25, 23.67, 19],
['name5', 27, 19, 25, 23.67, 19],
['name6', 28, 19, 26, 24.33, 19],
['name7', 28, 19, 26, 24.33, 19],
['name8', 28, 19, 26, 24.33, 19],
['name9', 26, 22, 27, 25.0, 22],
['name10', 27, 23, 25, 25.0, 23],
['name11', 30, 19, 27, 25.33, 19],
['name12', 24, 31, 28, 27.67, 24],
['name13', 28, 27, 28, 27.67, 27],
['name14', 27, 29, 27, 27.67, 27],
['name15', 29, 26, 29, 28.0, 26],
['name16', 29, 26, 30, 28.33, 26],
['name17', 30, 31, 26, 29.0, 26],
['name18', 33, 27, 30, 30.0, 27],
['name19', 29, 31, 30, 30.0, 29],
['name20', 30, 36, 31, 32.33, 30],
['name21', 36, 30, 32, 32.67, 30],
['name22', 38, 33, 36, 35.67, 33],
['name23', 30, 27, 99, 52.0, 27],
['name24', 99, 27, 32, 52.67, 27],
['name25', 37, 99, 36, 57.33, 36]], columns= ['1', '2', '3', '4', '5', '6'])
df["rank"] = df['5'].rank(method = "dense")
df
Output:
1 2 3 4 5 6 rank
0 name0 24 19 25 22.67 19 1.0
1 name1 25 19 25 23.00 19 2.0
2 name2 25 19 25 23.00 19 2.0
3 name3 24 22 23 23.00 22 2.0
4 name4 27 19 25 23.67 19 3.0
5 name5 27 19 25 23.67 19 3.0
6 name6 28 19 26 24.33 19 4.0
7 name7 28 19 26 24.33 19 4.0
8 name8 28 19 26 24.33 19 4.0
9 name9 26 22 27 25.00 22 5.0
10 name10 27 23 25 25.00 23 5.0
11 name11 30 19 27 25.33 19 6.0
12 name12 24 31 28 27.67 24 7.0
13 name13 28 27 28 27.67 27 7.0
14 name14 27 29 27 27.67 27 7.0
15 name15 29 26 29 28.00 26 8.0
16 name16 29 26 30 28.33 26 9.0
17 name17 30 31 26 29.00 26 10.0
18 name18 33 27 30 30.00 27 11.0
19 name19 29 31 30 30.00 29 11.0
20 name20 30 36 31 32.33 30 12.0
21 name21 36 30 32 32.67 30 13.0
22 name22 38 33 36 35.67 33 14.0
23 name23 30 27 99 52.00 27 15.0
24 name24 99 27 32 52.67 27 16.0
25 name25 37 99 36 57.33 36 17.0
If you want a list of lists:
df = df.set_index('rank').reset_index()
df.values.tolist()
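The rank column in the output above is computed from the mean alone, so name3 receives 2.0 even though its minimum differs from name1 and name2. As a rough plain-Python sketch of a dense rank over the (mean, min) pair, assuming the keys sit at indices 4 and 5 (the helper `dense_ranks` and the `sample` rows are illustrative, not part of the answer):

```python
def dense_ranks(rows, key_indices=(4, 5)):
    """Return the 1-based dense rank of each row's (mean, min) key."""
    keys = [tuple(row[i] for i in key_indices) for row in rows]
    # Distinct keys in ascending order; position + 1 is the dense rank.
    rank_of = {key: rank for rank, key in enumerate(sorted(set(keys)), start=1)}
    return [rank_of[key] for key in keys]


# Rows from the question, deliberately out of order.
sample = [
    ['name9', 26, 22, 27, 25.0, 22],
    ['name0', 24, 19, 25, 22.67, 19],
    ['name10', 27, 23, 25, 25.0, 23],
    ['name1', 25, 19, 25, 23.0, 19],
    ['name2', 25, 19, 25, 23.0, 19],
]
print(dense_ranks(sample))  # [3, 1, 4, 2, 2]
```

Because the ranks come from a lookup table of distinct keys, this variant does not require the rows to be sorted, and name9 and name10 get different ranks despite sharing a mean of 25.0.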
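You can pair adjacent items by zipping the list with itself after padding one copy with None values, then iterate over the zipped pairs to compare the key fields and reuse the previous rank when they are the same. Note that this skips ranks after a tie (1, 2, 2, 4, ...) rather than producing the 1, 2, 2, 3 numbering shown in the question: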
for i, ((*_, prev_mean, prev_min), (*_, mean, _min)) in enumerate(zip([(None, None)] + l, l)):
    l[i].insert(0, str(l[i - 1][0] if mean == prev_mean and _min == prev_min else i + 1))
Assuming your list of lists is stored in the variable l, l becomes:
[['1', 'name0', 24, 19, 25, 22.67, 19],
['2', 'name1', 25, 19, 25, 23.0, 19],
['2', 'name2', 25, 19, 25, 23.0, 19],
['4', 'name3', 24, 22, 23, 23.0, 22],
['5', 'name4', 27, 19, 25, 23.67, 19],
['5', 'name5', 27, 19, 25, 23.67, 19],
['7', 'name6', 28, 19, 26, 24.33, 19],
['7', 'name7', 28, 19, 26, 24.33, 19],
['7', 'name8', 28, 19, 26, 24.33, 19],
['10', 'name9', 26, 22, 27, 25.0, 22],
['11', 'name10', 27, 23, 25, 25.0, 23],
['12', 'name11', 30, 19, 27, 25.33, 19],
['13', 'name12', 24, 31, 28, 27.67, 24],
['14', 'name13', 28, 27, 28, 27.67, 27],
['14', 'name14', 27, 29, 27, 27.67, 27],
['16', 'name15', 29, 26, 29, 28.0, 26],
['17', 'name16', 29, 26, 30, 28.33, 26],
['18', 'name17', 30, 31, 26, 29.0, 26],
['19', 'name18', 33, 27, 30, 30.0, 27],
['20', 'name19', 29, 31, 30, 30.0, 29],
['21', 'name20', 30, 36, 31, 32.33, 30],
['22', 'name21', 36, 30, 32, 32.67, 30],
['23', 'name22', 38, 33, 36, 35.67, 33],
['24', 'name23', 30, 27, 99, 52.0, 27],
['25', 'name24', 99, 27, 32, 52.67, 27],
['26', 'name25', 37, 99, 36, 57.33, 36]]
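The output above follows what is often called competition ranking, where a tie consumes the positions it occupies. For comparison, here is a sketch that produces the same numbering without mutating the input, assuming sorted rows with the keys at indices 4 and 5; `competition_rank` is a made-up helper name:

```python
from itertools import groupby
from operator import itemgetter


def competition_rank(rows, key=itemgetter(4, 5)):
    """Prepend a '1, 2, 2, 4'-style rank; assumes rows are sorted by key."""
    ranked = []
    position = 1  # 1-based position of the first row in the current group
    for _, group in groupby(rows, key=key):
        members = list(group)
        ranked.extend([str(position)] + row for row in members)
        position += len(members)  # a tie uses up one position per tied row
    return ranked


rows = [
    ['name0', 24, 19, 25, 22.67, 19],
    ['name1', 25, 19, 25, 23.0, 19],
    ['name2', 25, 19, 25, 23.0, 19],
    ['name3', 24, 22, 23, 23.0, 22],
]
result = competition_rank(rows)
print([r[0] for r in result])  # ['1', '2', '2', '4']
```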