How to apply binary search in Python on a sorted list of string elements?

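I have a sorted list of string elements (city names), and I want to run a binary search on it to filter the cities by the leading letters the user supplies.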

For example, the user requests: http://127.0.0.1:8000/api/?city=New

So in this case I need to find the cities that start with New.

Sample output:

[
"New Abbey|Ceredigion|United Kingdom",
"New Albany|Indiana|United States",
"New Albany|Kansas|United States",
"New Albany|Mississippi|United States",
"New Albany|Ohio|United States"
]

Please advise.

You can use a list comprehension to filter the items you want:

[x for x in cities if x.startswith('New')]
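As a runnable illustration of the one-liner above, here is a minimal sketch. The function name filter_by_prefix and the shortened cities list are made up for this example. Note that this is a linear scan over the whole list, not a binary search, so it does not take advantage of the list being sorted.

```python
def filter_by_prefix(cities, prefix):
    """Return every entry that starts with prefix (linear scan over the list)."""
    return [city for city in cities if city.startswith(prefix)]


# Hypothetical sample data in the same "City|Region|Country" shape as the question.
cities = [
    "Aaaa|aaaa|Test",
    "New Abbey|Ceredigion|United Kingdom",
    "New Albany|Indiana|United States",
    "Zzzz|zzzz|Test",
]

print(filter_by_prefix(cities, "New"))
```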

If you want to implement binary search in Python yourself, this may help you.

def binarySearch(alist, item):
    first = 0
    last = len(alist)-1
    found = False

    while first<=last and not found:
         midpoint = (first + last)//2
         if alist[midpoint] == item:
             found = True
         else:
             if item < alist[midpoint]:
                 last = midpoint-1
             else:
                 first = midpoint+1

    return found

testlist = [0, 1, 2, 8, 13, 17, 19, 32, 42]
print(binarySearch(testlist, 3))   # 3 is not in the list: prints False
print(binarySearch(testlist, 13))  # 13 is in the list: prints True

Source: http://interactivepython.org/runestone/static/pythonds/SortSearch/TheBinarySearch.html

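The following approach should work. It uses bisect, Python's own binary search module, to find the starting index in the list. For the search term New it returns 2 for my example list. itertools.takewhile can then be used to return entries until one no longer matches your search term: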

import bisect, itertools

locations = [
    "Aaaa|aaaa|Test",
    "Bbbb|bbbb|Test",
    "New Abbey|Ceredigion|United Kingdom",
    "New Albany|Indiana|United States",
    "New Albany|Kansas|United States",
    "New Albany|Mississippi|United States",
    "New Albany|Ohio|United States",
    "Zzzz|zzzz|Test"
    ]

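search = "New"
# Binary search for the first position where the search term could be inserted
start_index = bisect.bisect_left(locations, search)
# Take entries from that position until one no longer starts with the search term
matches = itertools.takewhile(lambda x: x.startswith(search), itertools.islice(locations, start_index, None))
print(list(matches))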

This gives the following output:

['New Abbey|Ceredigion|United Kingdom', 'New Albany|Indiana|United States', 'New Albany|Kansas|United States', 'New Albany|Mississippi|United States', 'New Albany|Ohio|United States']
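If the run of matches can be long, a possible variation is to find the end of the run with a second binary search instead of walking it entry by entry. The sketch below is only one way to do that: prefix_range is a made-up name, and it relies on the assumption that no entry has the character U+10FFFF (the highest Unicode code point) directly after the prefix, so that prefix + "\U0010ffff" sorts after every real match.

```python
import bisect


def prefix_range(sorted_cities, prefix):
    """Return the slice of sorted_cities whose entries start with prefix."""
    # First index whose entry is >= prefix.
    start = bisect.bisect_left(sorted_cities, prefix)
    # Assumption: no entry contains U+10FFFF right after the prefix, so this
    # sentinel string compares greater than every entry that starts with prefix.
    end = bisect.bisect_left(sorted_cities, prefix + "\U0010ffff", start)
    return sorted_cities[start:end]


locations = [
    "Aaaa|aaaa|Test",
    "Bbbb|bbbb|Test",
    "New Abbey|Ceredigion|United Kingdom",
    "New Albany|Indiana|United States",
    "New Albany|Kansas|United States",
    "New Albany|Mississippi|United States",
    "New Albany|Ohio|United States",
    "Zzzz|zzzz|Test",
]

print(prefix_range(locations, "New"))
```

Both lookups are binary searches, so only the final slice costs time proportional to the number of matches.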