Compare two JSON objects in Python and get the matches
I have two lists of dictionaries.
The first is
Livescore
{'dictA':{'Team_name': 'Turkey - Italy', 'First_team': 6.8, 'Draw': 4.0, 'Second_team': 1.53, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'}}
{'dictA':{'Team_name': 'Wales- Sweden', 'First_team': 3.5, 'Draw': 3.2, 'Second_team': 2.23, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'}}
{'dictA':{'Team_name': 'Spain - Georgia', 'First_team': 1.51, 'Draw': 4.2, 'Second_team': 6.6, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'}}
{'dictA':{'Team_name': 'Belgium - Russia', 'First_team': 1.67, 'Draw': 3.85, 'Second_team': 5.2, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'}}
The second is
Flashscore
{'dictA':{'Team_name': 'England- Italy', 'First_team': 1.8, 'Draw': 3.0, 'Second_team': 1.53, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'}}
{'dictA':{'Team_name': 'Wales- Sweden', 'First_team': 3.5, 'Draw': 3.2, 'Second_team': 2.23, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'}}
{'dictA':{'Team_name': 'Spain - Georgia', 'First_team': 1.51, 'Draw': 4.2, 'Second_team': 6.6, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'}}
{'dictA':{'Team_name': 'Russia - Sweden', 'First_team': 1.67, 'Draw': 3.85, 'Second_team': 5.2, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'}}
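The goal is to get the match data: for example, if England - Italy appears in both Flashscore and Livescore, fetch its data.
This is hard for me because the rows are not in the same order. For example, England - Italy may be on line 2 in Flashscore but on line 4 in Livescore, so when I compare with an if statement I cannot get the data correctly, because the rows do not line up: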
for i in range(len(livescore)):
    if flashscore[i]["Team_name"] == livescore[i]["Team_name"]:
        print(livescore[i]["Team_name"])
I only need to match on the name, "Team_name".
You just need to find the matching items in the two lists.
for item in flashscore:
    for other in livescore:
        if item["Team_name"] == other["Team_name"]:
            print(item, other)
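However, this is very inefficient, because every item in one list is compared against every item in the other. A better arrangement is to build a dictionary for each list, keyed by team name.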
flashindex = {x["Team_name"]: x for x in flashscore}
liveindex = {x["Team_name"]: x for x in livescore}
Now it is easy to find all the items that are in both:
for item in liveindex.keys():
    if item in flashindex:
        print(flashindex[item], liveindex[item])
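As a self-contained sketch of the index approach above: dictionary key views support set operations, so the common names can be taken with `&`. The sample lists here are trimmed to two fields for brevity, and the `matches` name is just an illustrative choice, not something from the question.

```python
# Sample data trimmed to the fields used here (an assumption for brevity).
livescore = [
    {"Team_name": "Turkey - Italy", "First_team": 6.8},
    {"Team_name": "Wales- Sweden", "First_team": 3.5},
    {"Team_name": "Spain - Georgia", "First_team": 1.51},
]
flashscore = [
    {"Team_name": "England- Italy", "First_team": 1.8},
    {"Team_name": "Spain - Georgia", "First_team": 1.51},
    {"Team_name": "Wales- Sweden", "First_team": 3.5},
]

# Index each list by team name so lookups do not depend on row order.
liveindex = {x["Team_name"]: x for x in livescore}
flashindex = {x["Team_name"]: x for x in flashscore}

# Key views behave like sets, so "&" gives the names present in both.
common = liveindex.keys() & flashindex.keys()

# Pair up the full records from both sources, in a stable (sorted) order.
matches = {name: (liveindex[name], flashindex[name]) for name in sorted(common)}

for name, (live, flash) in matches.items():
    print(name, live["First_team"], flash["First_team"])
```

Note that if a list contains the same `Team_name` twice, the dictionary keeps only the last occurrence.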
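There are many more variations you could do, but hopefully this at least gets you started in the right direction.
One approach is to build a set of team names for Livescore and one for Flashscore, then take the intersection of the two.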
live_score_names = set(item['Team_name'] for item in live_score_items)
flash_score_names = set(item['Team_name'] for item in flash_score_items)
intersection_names = live_score_names.intersection(flash_score_names)
You can now filter the items in live_score_items and flash_score_items as needed. You know that every name in intersection_names exists in both.
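The filtering step could look roughly like this. It is a minimal sketch: the contents of `live_score_items` and `flash_score_items` are made-up sample rows, and `live_matches` / `flash_matches` are names introduced here for illustration.

```python
# Hypothetical sample rows, trimmed to two fields.
live_score_items = [
    {"Team_name": "Turkey - Italy", "Draw": 4.0},
    {"Team_name": "Wales- Sweden", "Draw": 3.2},
    {"Team_name": "Spain - Georgia", "Draw": 4.2},
]
flash_score_items = [
    {"Team_name": "England- Italy", "Draw": 3.0},
    {"Team_name": "Spain - Georgia", "Draw": 4.2},
    {"Team_name": "Wales- Sweden", "Draw": 3.2},
]

live_score_names = {item["Team_name"] for item in live_score_items}
flash_score_names = {item["Team_name"] for item in flash_score_items}
intersection_names = live_score_names & flash_score_names

# Keep only the rows whose name appears in both sources.
# Each result keeps the row order of its own source list.
live_matches = [i for i in live_score_items if i["Team_name"] in intersection_names]
flash_matches = [i for i in flash_score_items if i["Team_name"] in intersection_names]

print(live_matches)
print(flash_matches)
```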
Could you change the dictionaries so that they contain something like this:
'Teams':{'England','Turkey'}
Then a set comparison would make sense.
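A rough sketch of that idea, assuming every label has the form 'A - B' with a single hyphen: split the label into a `frozenset` of team names and use that as the key. `teams_of` is a hypothetical helper, and the sample rows are invented to show the effect.

```python
def teams_of(team_name):
    """Return the two team names as a frozenset.

    Assumes the label looks like 'A - B' with exactly one hyphen;
    a team name that itself contains a hyphen would be split wrongly.
    """
    return frozenset(part.strip() for part in team_name.split("-"))


# Hypothetical rows: note the inconsistent spacing and the swapped order.
livescore = [
    {"Team_name": "Turkey - Italy"},
    {"Team_name": "Wales- Sweden"},
    {"Team_name": "Spain - Georgia"},
]
flashscore = [
    {"Team_name": "England- Italy"},
    {"Team_name": "Wales - Sweden"},
    {"Team_name": "Georgia - Spain"},
]

# frozenset is hashable, so it can be used as a dictionary key.
live_by_teams = {teams_of(x["Team_name"]): x for x in livescore}
flash_by_teams = {teams_of(x["Team_name"]): x for x in flashscore}

common = live_by_teams.keys() & flash_by_teams.keys()
for teams in common:
    print(sorted(teams), live_by_teams[teams], flash_by_teams[teams])
```

This tolerates spacing differences such as 'Wales- Sweden' versus 'Wales - Sweden'. It also treats 'Georgia - Spain' and 'Spain - Georgia' as the same match, which may or may not be what you want, since First_team and Second_team would then refer to different sides.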
def prepare_score_dict(sample_list):
    result = {}
    for item in sample_list:
        # pop() removes 'Team_name' from the original dict and uses it as the key
        result[item.pop('Team_name')] = item
    return result

livescore_dict = prepare_score_dict(Livescore)
flashscore_dict = prepare_score_dict(Flashscore)
for key in livescore_dict:
    if key in flashscore_dict:
        print(key)
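One thing to be aware of: `item.pop('Team_name')` modifies the dictionaries in the original lists, so they no longer contain 'Team_name' afterwards. If that matters, a variant that works on copies might look like this (the sample lists are shortened, and the copying is my change, not part of the answer above):

```python
def prepare_score_dict(sample_list):
    """Index rows by 'Team_name' without modifying the input rows."""
    result = {}
    for item in sample_list:
        rest = dict(item)             # shallow copy, the original stays intact
        name = rest.pop("Team_name")  # remove the key from the copy only
        result[name] = rest
    return result


# Shortened sample data (assumed for this sketch).
Livescore = [
    {"Team_name": "Turkey - Italy", "First_team": 6.8},
    {"Team_name": "Wales- Sweden", "First_team": 3.5},
]
Flashscore = [
    {"Team_name": "England- Italy", "First_team": 1.8},
    {"Team_name": "Wales- Sweden", "First_team": 3.5},
]

livescore_dict = prepare_score_dict(Livescore)
flashscore_dict = prepare_score_dict(Flashscore)

common_keys = [key for key in livescore_dict if key in flashscore_dict]
print(common_keys)
```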
Give this a try.
Livescore = [
{'Team_name': 'Turkey - Italy', 'First_team': 6.8, 'Draw': 4.0, 'Second_team': 1.53, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'},
{'Team_name': 'Wales- Sweden', 'First_team': 3.5, 'Draw': 3.2, 'Second_team': 2.23, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'},
{'Team_name': 'Spain - Georgia', 'First_team': 1.51, 'Draw': 4.2, 'Second_team': 6.6, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'},
{'Team_name': 'Belgium - Russia', 'First_team': 1.67, 'Draw': 3.85, 'Second_team': 5.2, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'}]
Flashscore = [
{'Team_name': 'England- Italy', 'First_team': 1.8, 'Draw': 3.0, 'Second_team': 1.53, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'},
{'Team_name': 'Wales- Sweden', 'First_team': 3.5, 'Draw': 3.2, 'Second_team': 2.23, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'},
{'Team_name': 'Spain - Georgia', 'First_team': 1.51, 'Draw': 4.2, 'Second_team': 6.6, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'},
{'Team_name': 'Russia - Sweden', 'First_team': 1.67, 'Draw': 3.85, 'Second_team': 5.2, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'}]
def get_match(data_1, data_2, line_name):
    # Collect the values of line_name from the first list for fast lookup
    data1_set = set(data1[line_name] for data1 in data_1)
    result_list = []
    for data2 in data_2:
        if data2[line_name] in data1_set:
            result_list.append(data2[line_name])
    return result_list

print(get_match(Livescore, Flashscore, 'Team_name'))
If you use pandas, you can try the following:
Sample data:
import pandas as pd
df_livescore = pd.DataFrame([{'Team_name': 'Turkey - Italy', 'First_team': 6.8, 'Draw': 4.0, 'Second_team': 1.53, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'},
{'Team_name': 'Wales - Sweden', 'First_team': 3.5, 'Draw': 3.2, 'Second_team': 2.23, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'},
{'Team_name': 'Spain - Georgia', 'First_team': 1.51, 'Draw': 4.2, 'Second_team': 6.6, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'},
{'Team_name': 'Belgium - Russia', 'First_team': 1.67, 'Draw': 3.85, 'Second_team': 5.2, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'}])
df_flashscore = pd.DataFrame([{'Team_name': 'England- Italy', 'First_team': 1.8, 'Draw': 3.0, 'Second_team': 1.53, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'},
{'Team_name': 'Wales - Sweden', 'First_team': 3.5, 'Draw': 3.2, 'Second_team': 2.23, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'},
{'Team_name': 'Spain - Georgia', 'First_team': 1.51, 'Draw': 4.2, 'Second_team': 6.6, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'},
{'Team_name': 'Russia - Sweden', 'First_team': 1.67, 'Draw': 3.85, 'Second_team': 5.2, 'Country_name': 'EURO 2020', 'Liga': 'Test_liga'}])
Code:
df_livescore.merge(df_flashscore, on=['Team_name'])
Output:
# Team_name First_team_x ... Country_name_y Liga_y
# 0 Wales - Sweden 3.50 ... EURO 2020 Test_liga
# 1 Spain - Georgia 1.51 ... EURO 2020 Test_liga
# [2 rows x 11 columns]