TimSort 最小运行选择。为什么完美平衡的合并优于不平衡的合并？

TimSort minRun selection. Why is a perfectly balanced merge preferred over imbalanced merge?

在 TimRun document 的“计算最小运行”部分中，它给出了为 N=2112 数组选择最小运行的好例子和坏例子。它指出使用 minrun = 32 是低效的，因为

runs of lengths 2048 and 64 to merge at the end The adaptive gimmicks can do that with fewer than 2048+64 compares, but it's still more compares than necessary, and-- mergesort's bugaboo relative to samplesort --a lot more data movement (O(N) copies just to get 64 elements into place).

它还将 minrun = 32 算法描述为：

then we've got a case similar to "2112", again leaving too little work for the last merge to do

然后说选择 minrun = 33 最终会得到一个更好的完美平衡合并。

If we take minrun=33 in this case, then we're very likely to end up with 64 runs each of length 33, and then all merges are perfectly balanced. Better!

我的问题是，如果总元素数相同（示例中为 2112），为什么合并完全平衡的排序数组比不平衡的数组更好？

当 minrun=33 的总元素也是 2112 时，为什么 minrun=32“进行了比必要更多的比较”？

为什么会有“更多的数据移动”？

为什么最后一次合并“做的工作太少”？

我的理解是，合并大小为 A 和大小为 B 的排序数组需要 O(A+B)。为什么 A 和 B 大小相同时效率更高？

我画了 2 个 minrun 场景是如何执行的图表，但我仍然很困惑。

对于 2112 个元素，如果所有运行的大小都是 33，那么从 33 到 2112 合并需要 6 个步骤：33 -> 66 -> 132 -> 264 -> 528 -> 1056 -> 2112。如果所有运行大小为 32，从 32 合并到 2112 需要 7 个步骤：32 -> 64 -> 128 -> 256 -> 512 -> 1024 -> 2048 -> 2112.

如果你算一下，minrun = 33，整个数组分 6 步处理，minrun = 32，几乎整个数组（2048 个元素）分 6 步处理，然后处理整个数组在第7步。

TimSort 最小运行选择。为什么完美平衡的合并优于不平衡的合并？

TimSort minRun selection. Why is a perfectly balanced merge preferred over imbalanced merge?

sorting

algorithm

mergesort

timsort

data-structures