"Lengthless"快速排序算法

Question

我一直在研究排序算法。到目前为止，我发现的所有排序算法都依赖于已知长度（几乎所有排序算法。我不能使用它们，因为 "proper" 长度是 O(n)），或者比快速排序慢（例如插入排序）。

在Lua中，有2个长度的概念：

适当的序列长度
- 是O(n)
- ipairs 等使用
序列长度
- 是 O(log n)
- 有漏洞（零值）
- 被table.insert等

我研究过堆排序，但堆排序需要构建一个堆，然后排序。它不会将两者都作为一个操作来完成，这意味着它仍然存在 O(n) 长度问题。

使用插入排序，您只需运行插入排序算法，直到找到第一个 nil。这仅对 table 的 "proper sequence" 部分进行排序（即从 1 到 n 的键没有任何 nil 值），但插入排序比快速排序慢。

是否有像插入排序一样不依赖于长度，但性能与快速排序相当的就地排序算法？

示例插入排序代码（在维基百科的帮助下）：

function isort(t)
  -- In-place insertion sort that never uses the length operator.
  -- Stops at the first nil, as expected. Breaks if you use "for i = 1, #t do"
  for i in ipairs(t) do
      local j = i
      while j > 1 and t[j-1] > t[j] do
          t[j], t[j-1] = t[j-1], t[j]
          j = j - 1
      end
  end
end

local t = {6, 5, 3, 1, 7, 2, 4, nil, 1, 1, 8, 3, 4, nil, nil, 1}
isort(t)
io.write("{")
if #t > 0 then
  io.write(tostring(t[1]))
  for i=2, #t do
    io.write(", ")
    io.write(tostring(t[i]))
  end
end
io.write("}\n")
-- stdout:
-- {1, 2, 3, 4, 5, 6, 7, nil, 1, 1, 8, 3, 4, nil, nil, 1}

Answer 1

由于排序本身必须至少花费 O(n log n)，额外的 O(n) 扫描似乎不会使算法无效。使用插入或冒泡排序等二次算法是错误的经济。

您可以使用 heapsort 变体，您只需迭代地插入到不断增长的堆中，而不是使用 O(n) buildheap 算法。 Heapsort 绝对是 O(n log n)，即使你增量构建堆，但我怀疑它是否与 quicksort 竞争。（对于大输入，尤其是逆序的大输入，它绝对可以与插入排序相媲美。）

您可以在维基百科中看到 pseudocode for standard heapsort。我下面的伪代码不同之处在于它不需要数组的大小作为参数，而是 returns 作为结果。它还使用基于 1 的向量而不是基于 0 的向量，因为您使用的是 Lua，因此假设 a 从 a[1] 到 a[count] 到运行 count 的某个值。

 procedure heapsort(a):
     input: an array of comparable elements
     output: the number of elements in the array.

     (Heapify successive prefixes of the array)
     count ← 1
     while a has an element indexed by count:
         siftup(a, count)
         count ← count + 1
     count ← count - 1

     (Extract the sorted list from the heap)
     i ← count
     while i > 1:
         swap(a, 1, i)
         i ← i - 1
         siftdown(a, i)

     return count

siftup 和 siftdown 是标准的堆函数，这里以基于 1 的版本呈现。提供的代码使用标准优化，其中筛选是通过单次旋转而不是一系列交换完成的；这显着减少了数组引用的数量。（heapsort 过程中的 swap 可以集成到 siftdown 中以节省一点额外费用，但它会掩盖算法。如果您想使用此优化，请将 val ← a[1] 更改为val ← a[count + 1]; a[count + 1] ← a[1] 并从 heapsort 中删除 swap。）

在基于 1 的堆中，节点 i 的父节点是节点 floor(i/2)，节点 i 的子节点是节点 2i 和 2i+1.回想一下，堆约束要求每个节点都不少于其父节点。（那产生一个minheap，用来产生降序，如果要升序，就需要一个maxheap，就是把下面的三值比较从>改成<。）

procedure siftup(a, count):
    input: a vector of length count, of which the first count - 1
           elements satisfy the heap constraint.
    result: the first count elements of a satisfy the heap constraint.

    val ← a[count]
    loop:
        parent ← floor(count / 2)
        if parent == 0 or val > a[parent]:
            a[count] ← val
            return
        else
            a[count] ← a[parent]
            count ← parent

procedure siftdown(a, count):
    input: a vector of length count which satisfies the heap constraint
           except for the first element.
    result: the first count elements of a satisfy the heap constraint.

    val ← a[1]
    parent ← 1
    loop :
        child ← 2 * parent
        if child < count and a[child] > a[child + 1]:
            child ← child + 1
        if count < child or not (val > a[child]):
            a[parent] ← val
            return
        else
            a[parent] ← a[child]
            parent ← child

"Lengthless"快速排序算法

"Lengthless" fast sorting algorithm

sorting

algorithm

lua

in-place

time-complexity