寻找最接近字符串的回文
Finding closest palindrome to string
我正在尝试编写一个 python 程序来找到最接近单词的回文。我可以在字符串的任何部分添加一个字母,从字符串的任何部分删除一个字母,或者更改字符串任何部分的一个字母。我一直在研究使用 levenshtein 距离来查找 post Edit Distance in Python 中两个单词之间所需的最小编辑次数。但我不确定如何以编程方式找到需要最少编辑次数的回文。我正在努力实现的一些例子:
palindrome('hello') = 'ollo'
#you can remove the h and then turn the e into an o, giving a palindrome in 2 steps
levenshtein('hello',palindrome('hello')) = 2
palindrome('test') = 'tet'
#you can remove the s to get a palindrome
levenshtein('test',palindrome('test')) = 1
palindrome('tart') = 'trart'
#adding an r or removing the r produces a palindrome, but both solutions only require 1 edit so either would be acceptable.
levenshtein('tart',palindrome('tart')) = 1
我能够使用链接 post 中的 levenshtein 代码找到两个字符串之间的距离。我需要帮助编写一个 palindrome() 函数,该函数接受一个字符串和 returns 最接近该字符串的回文。
这是我的 DFS 实现。它只搜索到 word length-2
的距离,因为过去它变得有点微不足道(除了一个字母以外的所有字母,将每个字母都更改为相同的)。
它会找到该距离限制内的所有回文,并按距离对它们进行排序。
import time
word = "hello"
visited_cost = {} # to keep track of non-palindromes we've considered
palindrome_cost = {} # to store actual palindromes we've found
def find_palindrome(w, dist=0 ):
# Don't go on forever
if len(w) == 0 or len(w) > len(word) + 2 or dist > len(word) - 2:
return []
# Don't retry potential palindromes that we've tried before
global visited_cost
if w in visited_cost:
if dist >= visited_cost[w]:
return []
visited_cost[w] = dist
# check if we've found a palindrome
if (reverse(w)) == w:
if w in palindrome_cost:
palindrome_cost[w] = min(palindrome_cost[w], dist)
else:
palindrome_cost[w] = dist
return [w]
palindromes = []
if len(w) > 1:
for x in drop_one(w):
palindromes += find_palindrome(x, dist+1)
for x in add_one(w):
palindromes += find_palindrome(x, dist+1)
for x in change_one(w):
palindromes += find_palindrome(x, dist+1)
return palindromes
# For a word w, gives all possible words obtained by dropping one letter
def drop_one(w):
return [w[:i]+w[i+1:] for i in range(len(w))]
# For a word w, gives all possible words obtained by inserting a capital X
# at any position in the word. Of course "X" could be any letter
def add_one(w):
return [w[:i]+"X"+ w[i:] for i in range(len(w))]
# For a word w, gives all possible words obtained by changing one letter
# to another letter occurring in the word
def change_one(w):
return [w[:i] +j + w[i + 1:] for i in range(len(w)) for j in w]
def reverse(s):
return "".join(reversed(s))
t0 = time.time()
results = set(find_palindrome(word))
s = sorted(results, key = lambda x: palindrome_cost[x])
for x in s:
print x, palindrome_cost[x]
print "Found %i palindromes based on '%s' in %.3f seconds" %\
(len(results), word, time.time() - t0)
输出:
ollo 2
olllo 2
elle 2
heleh 2
hllh 2
oeleo 2
hlllh 2
[...]
Found 46 palindromes based on 'hello' in 0.065 seconds
我正在尝试编写一个 python 程序来找到最接近单词的回文。我可以在字符串的任何部分添加一个字母,从字符串的任何部分删除一个字母,或者更改字符串任何部分的一个字母。我一直在研究使用 levenshtein 距离来查找 post Edit Distance in Python 中两个单词之间所需的最小编辑次数。但我不确定如何以编程方式找到需要最少编辑次数的回文。我正在努力实现的一些例子:
palindrome('hello') = 'ollo'
#you can remove the h and then turn the e into an o, giving a palindrome in 2 steps
levenshtein('hello',palindrome('hello')) = 2
palindrome('test') = 'tet'
#you can remove the s to get a palindrome
levenshtein('test',palindrome('test')) = 1
palindrome('tart') = 'trart'
#adding an r or removing the r produces a palindrome, but both solutions only require 1 edit so either would be acceptable.
levenshtein('tart',palindrome('tart')) = 1
我能够使用链接 post 中的 levenshtein 代码找到两个字符串之间的距离。我需要帮助编写一个 palindrome() 函数,该函数接受一个字符串和 returns 最接近该字符串的回文。
这是我的 DFS 实现。它只搜索到 word length-2
的距离,因为过去它变得有点微不足道(除了一个字母以外的所有字母,将每个字母都更改为相同的)。
它会找到该距离限制内的所有回文,并按距离对它们进行排序。
import time
word = "hello"
visited_cost = {} # to keep track of non-palindromes we've considered
palindrome_cost = {} # to store actual palindromes we've found
def find_palindrome(w, dist=0 ):
# Don't go on forever
if len(w) == 0 or len(w) > len(word) + 2 or dist > len(word) - 2:
return []
# Don't retry potential palindromes that we've tried before
global visited_cost
if w in visited_cost:
if dist >= visited_cost[w]:
return []
visited_cost[w] = dist
# check if we've found a palindrome
if (reverse(w)) == w:
if w in palindrome_cost:
palindrome_cost[w] = min(palindrome_cost[w], dist)
else:
palindrome_cost[w] = dist
return [w]
palindromes = []
if len(w) > 1:
for x in drop_one(w):
palindromes += find_palindrome(x, dist+1)
for x in add_one(w):
palindromes += find_palindrome(x, dist+1)
for x in change_one(w):
palindromes += find_palindrome(x, dist+1)
return palindromes
# For a word w, gives all possible words obtained by dropping one letter
def drop_one(w):
return [w[:i]+w[i+1:] for i in range(len(w))]
# For a word w, gives all possible words obtained by inserting a capital X
# at any position in the word. Of course "X" could be any letter
def add_one(w):
return [w[:i]+"X"+ w[i:] for i in range(len(w))]
# For a word w, gives all possible words obtained by changing one letter
# to another letter occurring in the word
def change_one(w):
return [w[:i] +j + w[i + 1:] for i in range(len(w)) for j in w]
def reverse(s):
return "".join(reversed(s))
t0 = time.time()
results = set(find_palindrome(word))
s = sorted(results, key = lambda x: palindrome_cost[x])
for x in s:
print x, palindrome_cost[x]
print "Found %i palindromes based on '%s' in %.3f seconds" %\
(len(results), word, time.time() - t0)
输出:
ollo 2
olllo 2
elle 2
heleh 2
hllh 2
oeleo 2
hlllh 2
[...]
Found 46 palindromes based on 'hello' in 0.065 seconds