基数排序和 O(N log N) 效率

Radix Sort & O(N log N) Efficiency

我最近一直在学习基数排序，我使用的来源之一是维基百科页面。目前有以下关于算法效率的段落：

The topic of the efficiency of radix sort compared to other sorting algorithms is somewhat tricky and subject to quite a lot of misunderstandings. Whether radix sort is equally efficient, less efficient or more efficient than the best comparison-based algorithms depends on the details of the assumptions made. Radix sort complexity is O(wn) for n keys which are integers of word size w. Sometimes w is presented as a constant, which would make radix sort better (for sufficiently large n) than the best comparison-based sorting algorithms, which all perform O(n log n) comparisons to sort n keys. However, in general w cannot be considered a constant: if all n keys are distinct, then w has to be at least log n for a random-access machine to be able to store them in memory, which gives at best a time complexity O(n log n). That would seem to make radix sort at most equally efficient as the best comparison-based sorts (and worse if keys are much longer than log n).

粗体部分成为了我无法逾越的障碍。我知道一般基数排序是 O(wn)，并且通过其他来源已经看到 O(n) 是如何实现的，但不太理解为什么 n 个不同的键需要 O(n log n) 时间来随机存储 -接入机。我相当确定它归结为一些简单的数学，但不幸的是，我仍然无法掌握扎实的理解。

我最接近的尝试如下：

给定一个基数 'B' 和该基数中的一个数 'N'，'N' 的最大位数是：

(logB of N) + 1.

如果给定列表 L 中的每个数字都是唯一的，那么我们最多有：

L *((logB of N) + 1) possibilities

此时我不确定如何进行。

有没有人能用粗体扩展上面的部分并分解为什么 n 个不同的键需要至少 log n 的随机访问存储？

假设 MSB 基数排序具有常量 m 个分箱：

对于必须容纳至少 n 个不同值的任意大数据类型，所需的位数为 N = ceiling(log2(n))
因此存储每个值所需的内存量也是O(log n)；假设顺序内存访问，读/写一个值的时间复杂度是O(N) = O(log n)，尽管可以使用指针代替
位数是O(N / m) = O(log n)
重要的是，每个连续的数字必须相差 2 的幂，即 m 也必须是 2 的幂；假设这对于硬件平台来说足够小，例如4 位数字 = 16 个 bin

排序中：

对于每个radix pass，其中有O(log n):
1. 计算每个桶：使用位运算获取当前数字的值 - O(1) 用于所有 n 值。应该注意每个计数器也必须是 N 位，尽管增加 1 将被（摊销）O(1)。如果我们使用非 2 的幂数，这通常是 O(log n log log n) ( source )
2. 使桶计数数组累加：必须执行m - 1次加法，每次加法为O(N) = O(log n)（不同于增量特例）
3. 写入输出数组：遍历n个值，再次确定bin，并写入具有正确偏移量的指针

因此总复杂度为O(log n) * [ n * O(1) + m * O(log n) + n * O(1) ] = O(n log n).

基数排序和 O(N log N) 效率

Radix Sort & O(N log N) Efficiency

sorting

big-o

radix-sort