合并排序导致 CLRS 错误(第 3 版)

Mergesort resulting in error in CLRS (3rd edition)

根据 CLRS(3rd edition) ch2 第 31-34 页,这是 mergemerge_sort.

的伪代码和各自的代码

def merge(a, p, q, r):
    n1 = q - p + 1
    n2 = r - q
    left, right = [], []
    for i in range(1, n1):
        left.append(a[p + i - 1])
    for j in range(1, n2):
        right.append(a[q + j])
    left.append(float('inf'))
    right.append(float('inf'))
    i, j = 1, 1
    for k in range(p, r):
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1

def merge_sort(a, p, r):
    if p < r:
        q = (p + r) // 2
        merge_sort(a, p, q)
        merge_sort(a, q + 1, r)
        merge(a, p, q, r)

使用方法如下:

if __name__ == '__main__':
    size = 100
    s = [random.randint(0, size) for _ in range(size)]
    print(s)
    merge_sort(s, 1, size)

结果:

[28, 4, 63, 29, 86, 36, 58, 87, 89, 23, 99, 38, 13, 22, 87, 77, 51, 62, 27, 35, 81, 23, 96, 92, 46, 21, 24, 44, 35, 54, 93, 61, 87, 1, 99, 44, 53, 77, 49, 52, 71, 12, 79, 60, 53, 74, 84, 92, 97, 8, 77, 92, 72, 33, 38, 61, 84, 13, 21, 62, 70, 88, 42, 59, 72, 48, 26, 12, 96, 78, 61, 99, 49, 38, 6, 37, 84, 17, 6, 39, 19, 15, 96, 71, 6, 92, 10, 61, 1, 83, 24, 57, 69, 78, 40, 6, 78, 84, 79, 14]
Traceback (most recent call last):
  File "algorithms/ch2/merge_sort.py", line 36, in <module>
    merge_sort(s, 1, size)
  File "algorithms/ch2/merge_sort.py", line 27, in merge_sort
    merge_sort(a, p, q)
  File "algorithms/ch2/merge_sort.py", line 27, in merge_sort
    merge_sort(a, p, q)
  File "algorithms/ch2/merge_sort.py", line 27, in merge_sort
    merge_sort(a, p, q)
  [Previous line repeated 3 more times]
  File "algorithms/ch2/merge_sort.py", line 29, in merge_sort
    merge(a, p, q, r)
  File "algorithms/ch2/merge_sort.py", line 16, in merge
    if left[i] <= right[j]:
IndexError: list index out of range

代码有问题吗?我有什么遗漏吗?

从伪代码到python代码的转换存在多个问题:

  • 伪代码中for i = 1 to n1,n1包含在内,所以你应该使用range(1, n1 + 1)

  • 对于 的相同注释对于 k = p 到 r,使用 range(p, r + 1)

  • 该算法使用无限保护值 (),这是一个在代码中表现不佳的数学概念。如果列表确实包含无限浮点值,算法将失败。您应该改为在合并循环中测试索引值,而不是依赖保护值。

  • 列表索引值在 C 和大多数编程语言中从 0 开始,因此初始调用 mergesort(s, 1, size) 是不一致的,除非您从所有索引值中减去 1在下标表达式中。

算法可以更好地形式化为基于 0 的循环并排除上限。 pseudo-code 令人困惑,这本书对程序员来说不实用。您应该考虑以您选择的语言使用实用方法分析算法的书籍。

这是修改后的版本:

# merge 2 adjacent slices from array a:
# a[p:q] and a[q:r]
def merge(a, p, q, r):
    n1 = q - p
    n2 = r - q
    left = a[p : q]
    right = a[q : r]
    i, j = 0, 0
    for k in range(p, r):
        if j >= n2 or (i < n1 and left[i] <= right[j]):
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1

def merge_sort(a, p, r):
    if r - p > 1:
        q = (p + r) // 2
        merge_sort(a, p, q)
        merge_sort(a, q, r)
        merge(a, p, q, r)

if __name__ == '__main__':
    size = 100
    s = [random.randint(0, size) for _ in range(size)]
    print(s)
    merge_sort(s, 0, size)
    print(s)