Identical word alignment using for loop

Question

I am trying to align the words in two lists:

sentence1 = ['boy','motorcycle','people','play']
sentence2 = ['run','boy','people','boy','play','play']

Here is my code:

def identicalWordsIndex(self, sentence1, sentence2):
    identical_index = []
    for i in xrange(len(sentence1)):
        for j in xrange(len(sentence2)):
            if sentence1[i] == sentence2[j]:
                idenNew1 = [i,j]
                identical_index.append(idenNew1)
            if sentence2[j] == sentence1[i]:
                idenNew2 = [j,i]
                identical_index.append(idenNew2)
    return identical_index

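What I want is to get the index numbers of the aligned words in sentence1 and sentence2.

The 1st result should hold the indices of aligned words from sentence1 to sentence2. The 2nd should hold the indices of aligned words from sentence2 to sentence1.

But the code above produces this: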

1st : [[0, 1], [1, 0], [0, 3], [3, 0], [2, 2], [2, 2], [3, 4], [4, 3], [3, 5], [5, 3]]
2nd : [[0, 1], [1, 0], [0, 3], [3, 0], [2, 2], [2, 2], [3, 4], [4, 3], [3, 5], [5, 3]]

The result I expect is this:

1st : [[0,1],[2,2],[3,4]]
2nd : [[1,0],[2,2],[3,0],[4,3],[5,3]]

Can anyone help me solve this? Thanks.

Answer 1

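You just need to add a break, and call the function once per direction instead of appending both [i, j] and [j, i] in the same loop. Try this: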

sentence1 = ['boy','motorcycle','people','play']
sentence2 = ['run','boy','people','boy','play','play']

def identicalWordsIndex(sentence1, sentence2):
    identical_index = []
    # range() works on Python 2 and 3; xrange() exists only in Python 2
    for i in range(len(sentence1)):
        for j in range(len(sentence2)):
            if sentence1[i] == sentence2[j]:
                identical_index.append([i, j])
                # stop at the first match so each word is aligned only once
                break
    return identical_index

print (identicalWordsIndex(sentence1, sentence2))
print (identicalWordsIndex(sentence2, sentence1))

This prints:

[[0, 1], [2, 2], [3, 4]]

[[1, 0], [2, 2], [3, 0], [4, 3], [5, 3]]
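
To see why the break matters, here is a minimal sketch that runs the same nested loop with and without it. The function name identical_words_index and the first_only flag are hypothetical, added only for this illustration; they are not part of the code above.

```python
def identical_words_index(sentence1, sentence2, first_only=True):
    """Return [i, j] pairs where sentence1[i] == sentence2[j].

    first_only is a hypothetical flag for illustration: when True, the
    inner loop stops at the first match for each word of sentence1.
    """
    pairs = []
    for i, word in enumerate(sentence1):
        for j, other in enumerate(sentence2):
            if word == other:
                pairs.append([i, j])
                if first_only:
                    break  # skip any later duplicates of this word
    return pairs


sentence1 = ['boy', 'motorcycle', 'people', 'play']
sentence2 = ['run', 'boy', 'people', 'boy', 'play', 'play']

first_matches = identical_words_index(sentence1, sentence2)
all_matches = identical_words_index(sentence1, sentence2, first_only=False)

print(first_matches)  # [[0, 1], [2, 2], [3, 4]]
print(all_matches)    # [[0, 1], [0, 3], [2, 2], [3, 4], [3, 5]]
```

Without the break, 'boy' and 'play' are each paired with both of their occurrences in sentence2, which is where the extra entries in the question's output come from.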

Answer 2

You can try this solution, which uses a single for loop and list.index to find the first match:

a = ['boy','motorcycle','people','play']
b = ['run','boy','people','boy','play','play']

def align_ab(a, b):
    indexed = []
    for k,v in enumerate(a):
        try:
            i = b.index(v)
            indexed.append([k,i])
        except ValueError:
            pass

    return indexed
# Align the words of a against b
print(align_ab(a,b))
# Align the words of b against a
print(align_ab(b,a))

Output:

>>> [[0, 1], [2, 2], [3, 4]]
>>> [[1, 0], [2, 2], [3, 0], [4, 3], [5, 3]]
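
Since b.index(v) rescans b for every word of a, a possible variant for longer lists is to record the first position of each word in a dict once and then look words up directly. This is only a sketch under the assumption that the words are hashable (plain strings are); the name align_first_occurrence is made up for this example.

```python
def align_first_occurrence(a, b):
    """Pair each index of a with the first index of the same word in b."""
    first_pos = {}
    for j, word in enumerate(b):
        # setdefault keeps the earliest index and ignores later duplicates
        first_pos.setdefault(word, j)
    # words of a that never occur in b are simply skipped
    return [[i, first_pos[word]] for i, word in enumerate(a) if word in first_pos]


a = ['boy', 'motorcycle', 'people', 'play']
b = ['run', 'boy', 'people', 'boy', 'play', 'play']

a_to_b = align_first_occurrence(a, b)
b_to_a = align_first_occurrence(b, a)

print(a_to_b)  # [[0, 1], [2, 2], [3, 4]]
print(b_to_a)  # [[1, 0], [2, 2], [3, 0], [4, 3], [5, 3]]
```

The results match the list.index version; the only difference is that b is scanned once rather than once per word of a.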

Answer 3

See if this works for you. In the last two lines, you can swap the arguments to get the alignment in the other direction.

sentence1 = ['boy','motorcycle','people','play']
sentence2 = ['run','boy','people','boy','play','play']

def identicalWordsIndex(sentence1, sentence2):
    identical_index = []
    for i in range(len(sentence1)):
        for j in range(len(sentence2)):
            if sentence1[i] == sentence2[j]:
                identical_index.append([i, j])
                break
    return identical_index

print(identicalWordsIndex(sentence1, sentence2))
print(identicalWordsIndex(sentence2, sentence1))

Output:

>>>[[0, 1], [2, 2], [3, 4]]
>>>[[1, 0], [2, 2], [3, 0], [4, 3], [5, 3]]
