What's the time complexity of this algorithm for finding the longest palindromic substring?

Question

Here's the Python code:

def is_palindrome(s):
    return s == s[::-1]


def longestp(s):
    if is_palindrome(s):
        return s

    maxp = s[0]

    for i in range(len(s)-1):
        half_length = len(maxp) // 2
        start = i - half_length
        end = i + half_length

        while start >= 0 and end <= len(s)-1:
            if is_palindrome(s[start:end+2]):
                if len(s[start:end+2]) > len(maxp):
                    maxp = s[start:end+2]
                end += 1
            elif is_palindrome(s[start:end+1]):
                if len(s[start:end+1]) > len(maxp):
                    maxp = s[start:end+1]
                start -= 1
                end += 1
            else:
                break

    return maxp

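At first I thought it was O(n^3) because of the two nested loops and the string slicing, but in my tests it turned out to be almost linear. Is there any kind of input for which this algorithm becomes slow?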

Answer 1

It's definitely not linear. Try it with an input that contains a lot of palindromes but isn't itself a palindrome:

>>> timeit.timeit('longestp(x)', 'x="a"*100000+"b"', globals=globals(), number=1)
5.5123205203562975
>>> timeit.timeit('longestp(x)', 'x="a"*10000+"b"', globals=globals(), number=1)
0.08460151217877865

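Slicing and s == s[::-1] have much better constant factors than interpreted Python code, and you need to make sure the inner loop isn't breaking early. Both effects can throw off attempts to judge time complexity by timing.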
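As a rough sanity check, the two timings quoted in this answer can be turned into an empirical growth exponent. This is only a two-point estimate under the assumption that the running time behaves like c * n^p; the variable names are made up for illustration.

```python
from math import log

# Timings copied from the two timeit runs in this answer (seconds).
# The inputs are really n + 1 characters long because of the trailing "b";
# that difference is negligible at these sizes.
n_small, t_small = 10_000, 0.08460151217877865
n_large, t_large = 100_000, 5.5123205203562975

# If t ~ c * n**p, then p = log(t_large / t_small) / log(n_large / n_small).
exponent = log(t_large / t_small) / log(n_large / n_small)
print("estimated exponent: %.2f" % exponent)
```

A tenfold increase in input length costs roughly 65 times as much time here, which gives an exponent of about 1.8: well above linear, and nowhere near cubic.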

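I don't think it's O(n^3) either. Because of the break condition, the nested loops don't interact the way you'd intuitively expect. The inner loop executes O(n) iterations over the course of the entire algorithm, because after a bounded number of iterations either len(maxp) grows or the loop breaks. Each iteration does O(n) work slicing and comparing, so this algorithm looks worst-case O(n^2) to me.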
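One way to probe the iteration-count claim is to instrument the loop. The sketch below is a copy of longestp from the question with a counter added; longestp_counting and the iterations variable are names introduced here, not part of the original code. It is an experiment on one family of inputs, not a proof of the bound.

```python
def is_palindrome(s):
    return s == s[::-1]


def longestp_counting(s):
    """Same logic as longestp, but also returns the number of
    inner while-loop iterations that were executed."""
    if is_palindrome(s):
        return s, 0

    maxp = s[0]
    iterations = 0

    for i in range(len(s) - 1):
        half_length = len(maxp) // 2
        start = i - half_length
        end = i + half_length

        while start >= 0 and end <= len(s) - 1:
            iterations += 1
            if is_palindrome(s[start:end + 2]):
                if len(s[start:end + 2]) > len(maxp):
                    maxp = s[start:end + 2]
                end += 1
            elif is_palindrome(s[start:end + 1]):
                if len(s[start:end + 1]) > len(maxp):
                    maxp = s[start:end + 1]
                start -= 1
                end += 1
            else:
                break

    return maxp, iterations


# Inputs of the same shape as the timing test: many palindromes,
# but the whole string is not one.
for n in (1000, 2000, 4000):
    s = "a" * n + "b"
    maxp, iterations = longestp_counting(s)
    print(len(s), len(maxp), iterations)
```

On these inputs the iteration count should stay within a small constant multiple of the string length, while each iteration slices and compares a substring whose length can be proportional to n. That combination is where the quadratic total comes from.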

Answer 2

The total time the algorithm needs looks proportional to

integral_0^N x dx = [(x^2)/2]_0^N = (N^2)/2 = O(N^2)

Strings matching ab* should give the worst-case behaviour.
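One way to read the integral is as a continuous stand-in for a discrete sum. Assuming (this is an interpretation, not something the answer states) that the k-th step of the inner loop slices and compares a substring of length about k, the total work over N steps would be:

```latex
\sum_{k=1}^{N} k \;=\; \frac{N(N+1)}{2} \;=\; O(N^2)
```

The leading term N^2/2 agrees with the value of the integral.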

Here is a piece of code that kind of demonstrates the worst-case behaviour experimentally.

The structure is as follows:

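Define a worstCase function that constructs "bad" strings of length N
Measure the time your function takes on those strings
Build a dataset of log(N) vs. log(time(N))
Fit a line and estimate its slope: this is the exponent p in O(N^p).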

The code:

def worstCase(length):
  return "a" + "b" * (length - 1)

from time import perf_counter as clock  # time.clock() was removed in Python 3.8
from math import log

xs = []
ys = []
for n in [4 * int(1000 * 1.2 ** n) for n in range(1, 20)]:
  s = worstCase(n)
  assert len(s) == n
  startTime = clock()
  p = longestp(s)
  endTime = clock()
  assert p == s[1:]
  t = endTime - startTime
  xs.append(log(n))
  ys.append(log(t))
  print("%d -> %f" % (n, endTime - startTime))

from numpy import polyfit

exponent, constant = polyfit(xs, ys, 1)

print("Exponent was: %f" % (exponent))

Here is the output (it takes a minute or two):

4800 -> 0.057818
5760 -> 0.078123
6908 -> 0.105169
8292 -> 0.145572
9952 -> 0.197657
11940 -> 0.276103
14332 -> 0.382668
17196 -> 0.534682
20636 -> 0.747468
24764 -> 1.048267
29720 -> 1.475469
35664 -> 2.081608
42796 -> 2.939904
51356 -> 4.216063
61628 -> 5.963550
73952 -> 8.691849
88744 -> 12.126039
106492 -> 19.684188
127788 -> 24.942766
Exponent was: 1.867208

It estimates the exponent to be about 1.87, which is closer to 2 than to 3.
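The fit can also be rechecked from the printed table without NumPy. The sketch below applies an ordinary least-squares line to log(n) against log(t), using the rounded timings copied from the output above; the measurements list and the other variable names are introduced here for illustration. Because the printed times are rounded to six decimals, the slope may differ from the reported value in the last digits.

```python
from math import log

# (n, seconds) pairs copied from the output listing above.
measurements = [
    (4800, 0.057818), (5760, 0.078123), (6908, 0.105169),
    (8292, 0.145572), (9952, 0.197657), (11940, 0.276103),
    (14332, 0.382668), (17196, 0.534682), (20636, 0.747468),
    (24764, 1.048267), (29720, 1.475469), (35664, 2.081608),
    (42796, 2.939904), (51356, 4.216063), (61628, 5.963550),
    (73952, 8.691849), (88744, 12.126039), (106492, 19.684188),
    (127788, 24.942766),
]

xs = [log(n) for n, _ in measurements]
ys = [log(t) for _, t in measurements]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# Slope of the least-squares line through (log n, log t):
# covariance of x and y divided by the variance of x.
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    / sum((x - mean_x) ** 2 for x in xs)
)
print("Exponent was: %f" % slope)
```

The slope is the estimated exponent p in O(N^p); a value a little under 2 is consistent with quadratic growth plus lower-order overheads at the smaller sizes.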
