Match line numbers in a 1D array with the first element of a 2D array for a large dataset in Python

I have two large lists; to keep things simple, I'll show only a small example here.

In list1 I have some words:

list1 = ['hello','stack','overflow']

In list2 I have line numbers referring to the words in list1, along with a numeric value that identifies the type of each word:

list2= [['0','10'],['2', '11'],['4', '12']]

I want to match the line numbers from list2

list2 = [['0','10'],['2', '11'],['4', '12']] #line numbers here are: 0,2,4

with the corresponding lines in list1,

list1 = ['hello','stack','overflow'] #correspondences found here are: hello (for list2[0]) and overflow (for list2[1])

so that I end up with a list3 containing the words and their labels:

list3 = [['hello','10'], ['overflow', '11']]

I found a way to do this by looping over both lists, but it is very slow and I don't think it is efficient. How can I simplify this lookup?

list1 = ['hello','stack','overflow']

list2= [['0','10'],['2', '11'],['4', '12']]


for i in range(0, len(list1)):
    for k in range(0, len(list2)):
        if (str(list2[k][0]) == str(i)):
            print("Found "+str(list1[i]))

Found hello
Found overflow

IIUC, you could do something like this:

list1 = ['hello', 'stack', 'overflow']
list2 = [['0', '10'], ['2', '11'], ['4', '12']]

# convert the line numbers from strings to ints
line_numbers = [(int(ln), label) for ln, label in list2]

# keep only line numbers that exist in list1 and pair each word with its label
res = [[list1[ln], label] for ln, label in line_numbers if ln < len(list1)]

print(res)

Output

[['hello', '10'], ['overflow', '11']]
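
The reason this is faster is that each line number is used directly as an index into list1, so list2 is traversed only once instead of being rescanned for every word. As a rough sketch, the same idea can be wrapped in a small function; the names `match_labels`, `words` and `pairs` are hypothetical and not from the original post, and the extra lower-bound check is an assumption that negative line numbers should be skipped (Python would otherwise treat them as indices counted from the end of the list).

```python
def match_labels(words, pairs):
    """Return [word, label] for every (line_number, label) pair whose
    line number is a valid index into words.

    Hypothetical helper: a single pass over pairs, with one direct
    index lookup per pair instead of a nested loop.
    """
    n = len(words)
    result = []
    for line_no, label in pairs:
        idx = int(line_no)  # line numbers are stored as strings
        # Assumption: out-of-range and negative line numbers are ignored.
        if 0 <= idx < n:
            result.append([words[idx], label])
    return result


list1 = ['hello', 'stack', 'overflow']
list2 = [['0', '10'], ['2', '11'], ['4', '12']]

list3 = match_labels(list1, list2)
print(list3)  # [['hello', '10'], ['overflow', '11']]
```

The work here grows with len(list2) only, whereas the nested loop in the question grows with len(list1) * len(list2), which is what makes it slow on large inputs.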