Grouping and computing sorted data items in a text file in Python

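The code snippet below sorts the data from data.txt. I have been trying to improve the script so that it groups entries belonging to the same group and adds up the scores within each group.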

import re
data = []
with open('data.txt') as f:
    for line in f:
        group, score, team = re.split(r' (-?\d*\.?\d+) ', line.strip('\n').strip())
        data.append((int(score), group.strip(), team.strip()))

data.sort(reverse=True)
print("Top Scores:")
for (score, group, team), _ in zip(data, range(100)):
    print(f'{group} - {score} - {team}')  

Source file (data.txt):

alpha 1 dream team
bravo 3 never mind us
charlie 1 diehard  
delta 2 just cool
echo 5 dont do it
falcon 3 your team
lima 6 allofme
charlie 10 diehard
romeo 12 justnow
echo 8 dont do it

Current output:

Top Scores:
romeo - 12 - justnow
charlie - 10 - diehard
echo - 8 - dont do it
lima - 6 - allofme
echo - 5 - dont do it
falcon - 3 - your team
bravo - 3 - never mind us
delta - 2 - just cool
charlie - 1 - diehard
alpha - 1 - dream team

Desired output: #-- grouped and totalled

echo 13 dont do it   #-- totalled since repeating
romeo 12 justnow
charlie 11 diehard   #-- totalled since repeating
lima 6 allofme
bravo 3 never mind us
falcon 3 your team
delta 2 just cool
alpha 1 dream team

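Use a dictionary for the grouping (a defaultdict in this case), then convert back to a list for sorting. By the way, a simple split like this doesn't need a regular expression: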

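from collections import defaultdict

data = defaultdict(int)
with open('data.txt') as f:
    for line in f:
        # maxsplit=2 keeps multi-word team names in one piece
        group, score, team = line.split(maxsplit=2)
        # key on (group, team) and accumulate the score
        data[(group, team.strip())] += int(score)
# back to a list, sorted by total score, highest first
sorteddata = sorted([[k[0], v, k[1]] for k, v in data.items()], key=lambda x: x[1], reverse=True)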

>>> sorteddata
[['echo', 13, 'dont do it'], ['romeo', 12, 'justnow'], ['charlie', 11, 'diehard'], ['lima', 6, 'allofme'], ['bravo', 3, 'never mind us'], ['falcon', 3, 'your team'], ['delta', 2, 'just cool'], ['alpha', 1, 'dream team']]
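
To get from that list to the layout in the desired output, something like the following might do. It is only a sketch: the helper name `total_scores` and the inlined `SAMPLE` string are made up for illustration (the sample lines are copied from the question so it runs without data.txt), and skipping blank lines is an assumption about how the real file should be handled.

```python
from collections import defaultdict

# Sample lines copied from data.txt in the question, inlined so the sketch runs on its own.
SAMPLE = """\
alpha 1 dream team
bravo 3 never mind us
charlie 1 diehard
delta 2 just cool
echo 5 dont do it
falcon 3 your team
lima 6 allofme
charlie 10 diehard
romeo 12 justnow
echo 8 dont do it
"""


def total_scores(lines):
    """Sum scores per (group, team); return rows sorted by total, highest first."""
    totals = defaultdict(int)
    for line in lines:
        if not line.strip():
            continue  # assumption: blank lines carry no data
        # maxsplit=2 keeps multi-word team names in one piece
        group, score, team = line.split(maxsplit=2)
        totals[(group, team.strip())] += int(score)
    # sorted() is stable, so rows with equal totals keep first-seen order
    return sorted(
        ((group, total, team) for (group, team), total in totals.items()),
        key=lambda row: row[1],
        reverse=True,
    )


rows = total_scores(SAMPLE.splitlines())
for group, total, team in rows:
    print(f"{group} {total} {team}")
```

With a real file you could pass the open file object instead of `SAMPLE.splitlines()`, since the function only needs an iterable of lines. Because the sort is stable, bravo stays ahead of falcon on the tied score of 3, which matches the order in the desired output.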