I need to find the n best students overall; the m best students from each country will form that country's national team

I have obtained the 32 best scores. Now I am trying to get the indices of the 32 best students so that I can show who they are. The link to my JSON file is here:

https://drive.google.com/file/d/1OOkX1hAD6Ot-I3h_DUM2gRqdSl5Hy2Pl/view

Here is the code:

import json


file_path = "C:/Users/User/Desktop/leksion 10/testim/u2/olympiad.json"

with open(file_path, 'r') as j:

    contents = json.loads(j.read())
    print(contents)

print("\n================================================")

class Competitor:
    def __init__(self, first_name, last_name, country, point):
        self.first_name = first_name
        self.last_name = last_name
        self.country= country
        self.point = int(point)
    def __repr__(self):
        return f'{self.first_name} {self.last_name} {self.country} {self.point}'

olimpiade=[]

for i in contents:
    olimpiade.append(Competitor(i.get('first_name'),
                                i.get('last_name'),
                                i.get('country'),
                                i.get('point'),))
print(olimpiade)
print("\n================================================")


# The 32 best students advance to the second phase. Build a function that returns the second-phase competitors.
print("\n================================================")

print(type(olimpiade))
print(type(contents))
print(type(Competitor))

for i in contents:
    

print(a)

print("\n================================================")

for i in olimpiade:
    for j in i:
        L=olimpiade.sort(key=lambda x: x.point)
print(L)

I tried this example:

pike=[]
for value in contents:
    pike.append(value['point'])
print(pike)


n = 32
  
pike.sort()
print(pike[-n:])

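There should be a score range and a score for each student, which will help you filter out the best students.

Here is how to build a useful dictionary for your problem.

First, I assume all your values are in a list and each value is a string.

Call that list texts.

We can get the list of country names from an external source: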

# In a shell first: pip install country-list
from country_list import countries_for_language
countries = dict(countries_for_language('en'))
countries = list(countries.values())

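Initialize an empty dictionary and fill it:

scores_dict = {}

for i in texts:
    country = None
    for j in countries:
        if j in i:
            country = j  # last country name found in the text
    if country is None:
        continue  # no known country in this entry, skip it

    # Collect every whole-number token in the text as a score
    score = [int(s) for s in i.split() if s.isdigit()]
    scores_dict.setdefault(country, []).extend(score)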

This gives you a dictionary that looks like this:

{'Albania': [5287],
 'Bolivia': [1666],
 'Croatia': [1201],
 'Cyprus': [8508]}

From here, you can iterate over each country to get the top 5 students overall and the top 5 students in each country.
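
A minimal sketch of that last step, assuming scores_dict maps each country to a list of integer scores in the shape shown above. The helper names top_per_country and top_overall and the sample values are hypothetical, and heapq.nlargest is only one convenient way to take the top k.

```python
import heapq

# Hypothetical sample in the same shape as scores_dict.
scores_dict = {
    'Albania': [5287, 9100, 4200],
    'Bolivia': [1666, 9999],
    'Croatia': [1201],
}

def top_per_country(scores, k=5):
    """Map each country to its k highest scores, best first."""
    return {country: heapq.nlargest(k, pts) for country, pts in scores.items()}

def top_overall(scores, k=5):
    """Return the k highest (score, country) pairs across all countries."""
    pairs = ((p, country) for country, pts in scores.items() for p in pts)
    return heapq.nlargest(k, pairs)

print(top_per_country(scores_dict, 2))
print(top_overall(scores_dict, 3))
```

Note that this dictionary keeps only scores, so it can rank results but cannot say which student earned them; keeping the full records is needed for that.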

Using the data from your link, downloaded to the file 'olympiad.json'

Code

import json

def best_students(lst, n=1):
    '''
        Top n students
    '''
    return sorted(lst,
                  key=lambda d: d['point'],  # sort by points
                  reverse=True)[:n]          # take the top n students

def best_students_by_country(lst, m=1):
    '''
        Top m students in each country
    '''
    # Sort by country
    by_country = sorted(lst, key = lambda d: d['country'])
    
    groups = []
           
    for d in by_country:
        # Start a new group for the first student or when the country changes
        if not groups or groups[-1][-1]['country'] != d['country']:
            groups.append([])
        groups[-1].append(d)  # add student to the current country's group

    # Best m students in each group
    return [best_students(g, m) for g in groups]
          

Usage

# Deserialize json file
with open('olympiad.json', 'r') as f:
    data = json.load(f)

# Top two students overall
print(best_students(data, 2))

# Top two students by country
print(best_students_by_country(data, 2))

Output

[{'first_name': 'Harvey',
  'last_name': 'Massey',
  'country': 'Bolivia',
  'point': 9999},
 {'first_name': 'Barbra',
  'last_name': 'Knight',
  'country': 'Equatorial Guinea',
  'point': 9998}]

[[{'first_name': 'Wade',
   'last_name': 'Dyer',
   'country': 'Afghanistan',
   'point': 9822},
  {'first_name': 'Terrell',
   'last_name': 'Martin',
   'country': 'Afghanistan',
   'point': 8875}],
 [{'first_name': 'Delaney',
   'last_name': 'Buck',
   'country': 'Albania',
   'point': 9729},
  {'first_name': 'Melton',
   'last_name': 'Ford',
   'country': 'Albania',
   'point': 9359}],
    ...
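
Since the question asks for the indices of the best students rather than just their scores, here is a small sketch of one way to get them: sort the positions by score instead of sorting the scores themselves. The function name best_indices and the sample records are hypothetical; with the real file, data would come from json.load.

```python
# Hypothetical records in the same shape as olympiad.json.
data = [
    {'first_name': 'Ann', 'last_name': 'A', 'country': 'X', 'point': 50},
    {'first_name': 'Bob', 'last_name': 'B', 'country': 'Y', 'point': 90},
    {'first_name': 'Cy', 'last_name': 'C', 'country': 'X', 'point': 70},
    {'first_name': 'Di', 'last_name': 'D', 'country': 'Y', 'point': 20},
]

def best_indices(lst, n=1):
    """Indices of the n highest-scoring records, best first."""
    order = sorted(range(len(lst)),
                   key=lambda i: lst[i]['point'],  # rank positions by score
                   reverse=True)
    return order[:n]

top = best_indices(data, 2)
print(top)
for i in top:
    # The index leads back to the full record, so the student can be shown.
    print(i, data[i]['first_name'], data[i]['last_name'], data[i]['point'])
```

Each returned index points back into the original list, so names and countries stay attached to the scores.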

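Based on your file, I created a DataFrame in pandas. The overall ranking is 'sorted_all'. 'ascending=False' puts the highest values first. For the national team, the best 7 students from Mexico are selected.

head() shows five rows by default.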

import pandas as pd

df = pd.read_json('olympiad.json')

sorted_all = df.sort_values(by='point', ascending=False)
sorted_national = df.sort_values(['country','point'], ascending=[True, False])

print(sorted_all.head())
print(sorted_national.loc[sorted_national['country'] == 'Mexico'].head(7))

Overall output

     first_name last_name            country  point
1453     Harvey    Massey            Bolivia   9999
3666     Barbra    Knight  Equatorial Guinea   9998
5228    Rebecca   Navarro            Tunisia   9994
338      Jolene     Pratt             Mexico   9993
5322    Barnett   Herrera            Comoros   9986

Output for the country Mexico

     first_name last_name country  point
338      Jolene     Pratt  Mexico   9993
5118      Doyle   Goodman  Mexico   9980
2967      Mindy    Watson  Mexico   9510
6074      Riley      Hall  Mexico   9426
5357       Leah   Collins  Mexico   8798
5596        Luz  Bartlett  Mexico   8592
3684    Annette     Perry  Mexico   8457
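
To get the best m students of every country at once rather than filtering one country at a time, a sketch along these lines might work. It assumes the same column names as the file ('country', 'point'); the sample rows and the values of n and m are hypothetical.

```python
import pandas as pd

# Hypothetical rows in the same shape as olympiad.json.
df = pd.DataFrame([
    {'first_name': 'A', 'last_name': 'One', 'country': 'Mexico', 'point': 9993},
    {'first_name': 'B', 'last_name': 'Two', 'country': 'Mexico', 'point': 9980},
    {'first_name': 'C', 'last_name': 'Three', 'country': 'Mexico', 'point': 9510},
    {'first_name': 'D', 'last_name': 'Four', 'country': 'Bolivia', 'point': 9999},
    {'first_name': 'E', 'last_name': 'Five', 'country': 'Bolivia', 'point': 1666},
])

n, m = 2, 2

# n best students overall; the index keeps each student's original row number.
top_n = df.nlargest(n, 'point')

# m best students per country: sort, then keep the first m rows of each group.
national = (df.sort_values(['country', 'point'], ascending=[True, False])
              .groupby('country')
              .head(m))

print(top_n)
print(national)
```

The row labels in both results are the original positions in the file, which also answers the question of which students the top scores belong to.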