I need to find the n best students overall, and the m best students of each country will form their national group
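I have obtained the 32 best scores. Now I am trying to get the indices of the 32 best students so that I can show who they are.
The link to my JSON file is here:
https://drive.google.com/file/d/1OOkX1hAD6Ot-I3h_DUM2gRqdSl5Hy2Pl/view
The code is as follows: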
import json

file_path = "C:/Users/User/Desktop/leksion 10/testim/u2/olympiad.json"
with open(file_path, 'r') as j:
    contents = json.loads(j.read())
print(contents)
print("\n================================================")

class Competitor:
    def __init__(self, first_name, last_name, country, point):
        self.first_name = first_name
        self.last_name = last_name
        self.country = country
        self.point = int(point)

    def __repr__(self):
        return f'{self.first_name} {self.last_name} {self.country} {self.point}'

olimpiade = []
for i in contents:
    olimpiade.append(Competitor(i.get('first_name'),
                                i.get('last_name'),
                                i.get('country'),
                                i.get('point')))
print(olimpiade)
print("\n================================================")
# The 32 best students advance to the second phase. Build a function that returns the second-phase competitors.
print("\n================================================")
print(type(olimpiade))
print(type(contents))
print(type(Competitor))
for i in contents:
    print(i)  # print each raw record
print("\n================================================")
# list.sort() sorts in place and returns None, so sorted() is used to get a new list
L = sorted(olimpiade, key=lambda x: x.point)
print(L)
I tried this example:
pike = []
for value in contents:
    pike.append(value['point'])
print(pike)
n = 32
pike.sort()
print(pike[-n:])
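Each student should have a score, and that is what lets you filter out the best students.
I have written up how to build a useful dictionary based on your question.
First, I assume all your values are in a list and each value is a string.
Call that list texts.
We can get the list of countries from an external source: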
pip install country-list
from country_list import countries_for_language
countries = dict(countries_for_language('en'))
countries = list(countries.values())
Initialize an empty dictionary: scores_dict = {}
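for i in texts:
    for j in countries:
        if j in i:
            country = j
            # collect every whitespace-separated token that is a number
            score = [int(s) for s in i.split() if s.isdigit()]
            try:
                scores_dict[country].extend(score)
            except KeyError:
                # first score seen for this country
                scores_dict[country] = score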
This gives you a dictionary that looks like this:
{'Albania': [5287],
'Bolivia': [1666],
'Croatia': [1201],
'Cyprus': [8508]}
From here, you can iterate over each country to get the top 5 students overall and the top 5 students of each country.
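The iteration described above could look like the following minimal sketch. It assumes scores_dict has the shape shown (a country name mapped to a list of integer scores); the sample values and the helper name top_scores are made up for illustration. Note that a dictionary of this shape keeps only the scores, so by itself it cannot tell you which student earned them.

```python
import heapq

# Hypothetical sample in the same shape as scores_dict above
scores_dict = {
    'Albania': [5287, 9729, 9359],
    'Bolivia': [1666, 9999],
    'Croatia': [1201],
}

def top_scores(scores, k=5):
    """Return the k highest scores, highest first (fewer if the list is short)."""
    return heapq.nlargest(k, scores)

# Top k scores per country
top_per_country = {country: top_scores(scores, 2)
                   for country, scores in scores_dict.items()}

# Top k scores overall: flatten all the per-country lists first
all_scores = [s for scores in scores_dict.values() for s in scores]
top_overall = top_scores(all_scores, 2)

print(top_per_country)
print(top_overall)
```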
Using the data from your link, downloaded to the file 'olympiad.json'.
Code
import json

def best_students(lst, n=1):
    '''
    Top n students
    '''
    return sorted(lst,
                  key=lambda d: d['point'],  # sort based upon points
                  reverse=True)[:n]          # take the top n students

def best_students_by_country(lst, m=1):
    '''
    Top m students in each country
    '''
    # Sort by country
    by_country = sorted(lst, key=lambda d: d['country'])
    groups = []
    for d in by_country:
        if not groups:
            groups.append([])
        elif groups[-1][-1]['country'] != d['country']:
            groups.append([])  # add new country
        # Append student to the current country's group
        groups[-1].append(d)
    # List comprehension for best m students in each group
    return [best_students(g, m) for g in groups]
Usage
# Deserialize json file
with open('olympiad.json', 'r') as f:
    data = json.load(f)
# Top two students overall
print(best_students(data, 2))
# Top two students by country
print(best_students_by_country(data, 2))
Output
[{'first_name': 'Harvey',
'last_name': 'Massey',
'country': 'Bolivia',
'point': 9999},
{'first_name': 'Barbra',
'last_name': 'Knight',
'country': 'Equatorial Guinea',
'point': 9998}]
[[{'first_name': 'Wade',
'last_name': 'Dyer',
'country': 'Afghanistan',
'point': 9822},
{'first_name': 'Terrell',
'last_name': 'Martin',
'country': 'Afghanistan',
'point': 8875}],
[{'first_name': 'Delaney',
'last_name': 'Buck',
'country': 'Albania',
'point': 9729},
{'first_name': 'Melton',
'last_name': 'Ford',
'country': 'Albania',
'point': 9359}],
...
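Since the question asks for the indices of the best students, here is a minimal sketch that keeps each student's position in the list alongside the record. The function name top_n_with_index and the sample records are assumptions for illustration and are not part of the code above; int() is applied in case the points are stored as strings.

```python
import heapq

def top_n_with_index(students, n=32):
    """Return (index, student) pairs for the n highest 'point' values."""
    return heapq.nlargest(
        n,
        enumerate(students),                      # pairs of (position, record)
        key=lambda pair: int(pair[1]['point']),   # rank by the record's points
    )

# Made-up sample records with the same keys as olympiad.json
sample = [
    {'first_name': 'A', 'last_name': 'One', 'country': 'X', 'point': 10},
    {'first_name': 'B', 'last_name': 'Two', 'country': 'Y', 'point': 30},
    {'first_name': 'C', 'last_name': 'Three', 'country': 'X', 'point': 20},
]

best = top_n_with_index(sample, 2)
for index, student in best:
    print(index, student['first_name'], student['last_name'], student['point'])
```

With the real data you would pass the list loaded from 'olympiad.json' and n=32.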
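Based on your file, I created a DataFrame in pandas.
The overall ranking is 'sorted_all'. 'ascending=False' puts the highest values first.
For the national team, the best 7 from Mexico are selected.
head() shows five rows by default.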
import pandas as pd
df = pd.read_json('olympiad.json')
sorted_all = df.sort_values(by='point', ascending=False)
sorted_national = df.sort_values(['country','point'], ascending=[True, False])
print(sorted_all.head())
print(sorted_national.loc[sorted_national['country'] == 'Mexico'].head(7))
Overall output
     first_name last_name            country  point
1453     Harvey    Massey            Bolivia   9999
3666     Barbra    Knight  Equatorial Guinea   9998
5228    Rebecca   Navarro            Tunisia   9994
338      Jolene     Pratt             Mexico   9993
5322    Barnett   Herrera            Comoros   9986
Output for Mexico
     first_name last_name country  point
338      Jolene     Pratt  Mexico   9993
5118      Doyle   Goodman  Mexico   9980
2967      Mindy    Watson  Mexico   9510
6074      Riley      Hall  Mexico   9426
5357       Leah   Collins  Mexico   8798
5596        Luz  Bartlett  Mexico   8592
3684    Annette     Perry  Mexico   8457
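To get the best m students for every country at once, rather than filtering one country at a time, grouping the sorted frame is one option. This is a sketch on a small made-up DataFrame; with the real file you would start from pd.read_json('olympiad.json') instead. The index labels in the result are the original row positions, which is what identifies each student.

```python
import pandas as pd

# Made-up sample with the same columns as olympiad.json
df = pd.DataFrame({
    'first_name': ['A', 'B', 'C', 'D', 'E'],
    'last_name': ['One', 'Two', 'Three', 'Four', 'Five'],
    'country': ['Mexico', 'Mexico', 'Mexico', 'Bolivia', 'Bolivia'],
    'point': [9993, 9980, 9510, 9999, 1666],
})

n, m = 2, 2

# n best students overall
top_overall = df.nlargest(n, 'point')

# m best students in each country: sort, then keep the first m rows per group
top_national = (
    df.sort_values(['country', 'point'], ascending=[True, False])
      .groupby('country')
      .head(m)
)

print(top_overall)
print(top_national)
```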