How to merge two tuples in a list if the first tuple elements match?
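I have two lists of tuples, in this form:
playerinfo = [('ansonca01', 4, 1871, 1, 'RC1'), ('forceda01', 44, 1871, 1, 'WS3'), ('mathebo01', 68, 1871, 1, 'FW1')]
idmatch = [('ansonca01', 'Anson', 'Cap', '05/06/1871'), ('aaroh101', 'Aaron', 'Hank', '04/13/1954'), ('aarot101', 'Aaron', 'Tommie', '04/10/1962')]
What I'd like to know is how to iterate through both lists and, if the first element of a tuple from "playerinfo" matches the first element of a tuple from "idmatch", merge the matching tuples to yield a new list of tuples, in the form:
merged_data = [('ansonca01', 4, 1871, 1, 'RC1', 'Anson', 'Cap', '05/06/1871'), (...), (...), etc.]
The new list of tuples would have each ID number matched to the first and last name of the correct player.
Background info: I'm trying to merge two CSV files of baseball statistics, but the file with all the relevant stats doesn't contain player names, only a reference number, e.g. 'ansoc101', while the second file contains the reference number in one column and the first and last names of the corresponding player in the other.
The CSVs are too large to do this manually (about 20,000 players), so I'm trying to automate the process.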
Use a list comprehension to iterate over your lists:
[x + y[1:] for x in list1 for y in list2 if x[0] == y[0]]
I tried this on the lists:
list1 = [("this", 1, 2, 3), ("that", 1, 2, 3), ("other", 1, 2, 3)]
list2 = [("this", 5, 6, 7), ("that", 10, 11, 12), ("notother", 1, 2, 3)]
and got:
[('this', 1, 2, 3, 5, 6, 7), ('that', 1, 2, 3, 10, 11, 12)]
Is that what you wanted?
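Using a dictionary:
- Iterate over the playerinfo list and create a dictionary where the key is the first item of each tuple and the value is a list of all its items.
- Print the result of the first step.
- Iterate over the idmatch list and check whether the first item of each tuple is a key in the result dictionary. If it is, extend that key's value with the new values using the list extend method.
- Print the result of the second step.
- Build the output format from the resulting dictionary.
Demo: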
import pprint

playerinfo = [("ansonca01", 4, 1871, 1, "RC1"),
              ("forceda01", 44, 1871, 1, "WS3"),
              ("mathebo01", 68, 1871, 1, "FW1")]
idmatch = [("ansonca01", "Anson", "Cap", "05/06/1871"),
           ("aaroh101", "Aaron", "Hank", "04/13/1954"),
           ("aarot101", "Aaron", "Tommie", "04/10/1962")]

# Step 1: key each player's record by its ID.
result = {}
for i in playerinfo:
    result[i[0]] = list(i)
print("Debug Result1:")
pprint.pprint(result)

# Step 2: append the name fields where the ID is already in the dictionary.
for i in idmatch:
    if i[0] in result:
        result[i[0]].extend(i[1:])
print("\nDebug Result2:")
pprint.pprint(result)

# Step 3: convert the lists back to tuples.
final_rs = []
for j in result.values():
    final_rs.append(tuple(j))
print("\nFinal result:")
pprint.pprint(final_rs)
Output:
infogrid@infogrid-vivek:~/workspace/vtestproject$ python task4.py
Debug Result1:
{'ansonca01': ['ansonca01', 4, 1871, 1, 'RC1'],
 'forceda01': ['forceda01', 44, 1871, 1, 'WS3'],
 'mathebo01': ['mathebo01', 68, 1871, 1, 'FW1']}

Debug Result2:
{'ansonca01': ['ansonca01', 4, 1871, 1, 'RC1', 'Anson', 'Cap', '05/06/1871'],
 'forceda01': ['forceda01', 44, 1871, 1, 'WS3'],
 'mathebo01': ['mathebo01', 68, 1871, 1, 'FW1']}

Final result:
[('ansonca01', 4, 1871, 1, 'RC1', 'Anson', 'Cap', '05/06/1871'),
 ('forceda01', 44, 1871, 1, 'WS3'),
 ('mathebo01', 68, 1871, 1, 'FW1')]
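Note that the demo above keeps forceda01 and mathebo01 in the final result even though they have no entry in idmatch, while the question asks only for merged tuples. A minimal sketch of one way to keep just the matched players with the same dictionary approach follows; the matched_ids set is a hypothetical helper added for illustration, not part of the original demo:

```python
playerinfo = [("ansonca01", 4, 1871, 1, "RC1"),
              ("forceda01", 44, 1871, 1, "WS3"),
              ("mathebo01", 68, 1871, 1, "FW1")]
idmatch = [("ansonca01", "Anson", "Cap", "05/06/1871"),
           ("aaroh101", "Aaron", "Hank", "04/13/1954"),
           ("aarot101", "Aaron", "Tommie", "04/10/1962")]

# Key each player's record by its ID, as in the demo.
result = {}
for rec in playerinfo:
    result[rec[0]] = list(rec)

# Hypothetical helper: remember which IDs were found in both lists.
matched_ids = set()
for rec in idmatch:
    if rec[0] in result:
        result[rec[0]].extend(rec[1:])
        matched_ids.add(rec[0])

# Emit only the records that actually received name fields,
# in the order they appear in playerinfo.
matched_only = [tuple(result[rec[0]]) for rec in playerinfo
                if rec[0] in matched_ids]
print(matched_only)
# -> [('ansonca01', 4, 1871, 1, 'RC1', 'Anson', 'Cap', '05/06/1871')]
```

Whether unmatched players should be dropped or kept without names depends on what the merged file is for, so treat this as one option.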
You could first create a dictionary for fast ID-number lookups, and then use a list comprehension to merge the data from the two lists efficiently:
import operator

playerinfo = [('ansonca01', 4, 1871, 1, 'RC1'),
              ('forceda01', 44, 1871, 1, 'WS3'),
              ('mathebo01', 68, 1871, 1, 'FW1')]
idmatch = [('ansonca01', 'Anson', 'Cap', '05/06/1871'),
           ('aaroh101', 'Aaron', 'Hank', '04/13/1954'),
           ('aarot101', 'Aaron', 'Tommie', '04/10/1962')]

get_id = operator.itemgetter(0)  # To get id field (named get_id to avoid shadowing the built-in id).
idinfo = {get_id(rec): rec[1:] for rec in idmatch}  # Dict for fast look-ups.
merged = [info + idinfo[get_id(info)] for info in playerinfo if get_id(info) in idinfo]
print(merged)  # -> [('ansonca01', 4, 1871, 1, 'RC1', 'Anson', 'Cap', '05/06/1871')]
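Since the question's data lives in two CSV files, the same dictionary lookup can be applied to rows read with the standard csv module. The following is only a sketch under assumptions: the function name merge_rows is hypothetical, the in-memory io.StringIO objects stand in for the real files (whose names and headers are not given in the question), and the ID is assumed to be the first column of both files:

```python
import csv
import io


def merge_rows(stats_rows, name_rows):
    """Append name columns to each stats row whose first column matches an ID."""
    # Build the lookup once: ID -> remaining columns of the name row.
    names_by_id = {row[0]: row[1:] for row in name_rows if row}
    # Keep only stats rows whose ID is present in the lookup.
    return [row + names_by_id[row[0]] for row in stats_rows
            if row and row[0] in names_by_id]


# In-memory stand-ins for the two CSV files (values taken from the question).
stats_csv = io.StringIO("ansonca01,4,1871,1,RC1\n"
                        "forceda01,44,1871,1,WS3\n")
names_csv = io.StringIO("ansonca01,Anson,Cap,05/06/1871\n"
                        "aaroh101,Aaron,Hank,04/13/1954\n")

merged = merge_rows(csv.reader(stats_csv), csv.reader(names_csv))
print(merged)
```

With real files, the StringIO objects would be replaced by files opened with open(path, newline=''). Note that csv.reader yields every field as a string, so numeric columns would need converting separately.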