How do I use difflib to return a list by searching for an element in the list?

Question

I have a list of lists that looks like this:

list123 = [["Title a1","100 Price","Company xx aa"], ["Title b1","200 Price","Company yy bb"], ["Title c1","300 Price","Company zz cc"]]

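How can I use difflib.get_close_matches (or something else) to return the whole inner list by searching for a specific inner element that matches the search argument?

How I thought it would work: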

print(difflib.get_close_matches('Company xx a', list123))

Expected output / the output I want:

 ["Title a1","100 Price","Company xx aa"]

Actual output:

[]

I know I could use something like this:

for item in list123:
    if "Company xx aa" in item:
        print(item)

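But I want to use the difflib library (or another one) to allow a more "human" search, where small spelling mistakes are tolerated.

If I have misunderstood what the function is for, is there another one that does what I want?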

Answer 1

I tried this:

import difflib

list123 = [["Title a1", "100 Price", "Company xx aa"],
           ["Title b1", "200 Price", "Company yy bb"],
           ["Title c1", "300 Price", "Cpswdaany zsdwz cawdc"]]

# Match against each inner list separately, since its elements are strings
for item in list123:
    print(difflib.get_close_matches("Company xx aa", item))

You will have to tune the function's parameters to control how lenient (how "human") the matching should be.
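As a rough illustration of that tuning, here is a minimal sketch built on the `n` and `cutoff` parameters of `difflib.get_close_matches`. The helper name `find_rows` and the value 0.8 are assumptions made for this example, not part of the original answer. With the default cutoff of 0.6, the other "Company ..." rows from the question's data also count as close to 'Company xx a', so a stricter cutoff is used to keep only the intended row.

```python
import difflib

list123 = [["Title a1", "100 Price", "Company xx aa"],
           ["Title b1", "200 Price", "Company yy bb"],
           ["Title c1", "300 Price", "Company zz cc"]]


def find_rows(keyword, rows, cutoff=0.8):
    """Hypothetical helper: return every inner list that contains
    at least one element whose similarity to keyword is >= cutoff."""
    return [row for row in rows
            if difflib.get_close_matches(keyword, row, n=1, cutoff=cutoff)]


strict = find_rows("Company xx a", list123, cutoff=0.8)
loose = find_rows("Company xx a", list123, cutoff=0.6)
print(strict)      # only the "Company xx aa" row
print(len(loose))  # all three rows pass the default cutoff
```

Raising `cutoff` makes the search stricter and lowering it tolerates more typos; a suitable value depends on how similar the entries are to one another.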

Answer 2

The problem is that the second argument of get_close_matches should be a list of strings. From the documentation:

possibilities is a list of sequences against which to match word (typically a list of strings).
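A small check can make the difference visible. This is only a sketch using the first two rows of the question's data; the variable names are chosen for the example.

```python
import difflib

rows = [["Title a1", "100 Price", "Company xx aa"],
        ["Title b1", "200 Price", "Company yy bb"]]

# Nested list: every possibility is itself a list, so the word is
# compared against lists of strings rather than against strings.
nested_result = difflib.get_close_matches("Company xx a", rows)

# One inner list: every possibility is a string, as the docs expect.
flat_result = difflib.get_close_matches("Company xx a", rows[0])

print(nested_result)  # []
print(flat_result)    # ['Company xx aa']
```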

To solve your problem, do the following:

import difflib


def key(choices, keyword='Company xx a'):
    matches = difflib.get_close_matches(keyword, choices)
    if matches:
        best_match, *_ = matches
        return difflib.SequenceMatcher(None, keyword, best_match).ratio()
    return 0.0


list123 = [["Title a1", "100 Price", "Company xx aa"],
           ["Title b1", "200 Price", "Company yy bb"],
           ["Title c1", "300 Price", "Company zz cc"]]

res = max(list123, key=key)

print(res)

Output

['Title a1', '100 Price', 'Company xx aa']

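The idea is that the key function returns the similarity of the best match within each list; combined with max, this finds the list that contains the best match.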
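One caveat with this approach: when no list contains a close match, every key is 0.0 and max simply returns the first list. A possible variant, sketched here with a hypothetical helper `best_row` and an assumed cutoff of 0.6, scores each element with `difflib.SequenceMatcher` directly and returns None when nothing is similar enough.

```python
import difflib

list123 = [["Title a1", "100 Price", "Company xx aa"],
           ["Title b1", "200 Price", "Company yy bb"],
           ["Title c1", "300 Price", "Company zz cc"]]


def best_row(rows, keyword, cutoff=0.6):
    """Hypothetical helper: return (row, score) for the inner list holding
    the element most similar to keyword, or (None, 0.0) if no element
    reaches the cutoff."""
    best, best_score = None, 0.0
    for row in rows:
        for element in row:
            score = difflib.SequenceMatcher(None, keyword, element).ratio()
            if score >= cutoff and score > best_score:
                best, best_score = row, score
    return best, best_score


row, score = best_row(list123, "Company xx a")
print(row, score)

missing_row, missing_score = best_row(list123, "zzzzqqq")
print(missing_row)  # None, because nothing is close enough
```

Returning the score alongside the row lets the caller decide whether the best candidate is good enough to show.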

python

search

matching

difflib