如何在 Python 中计算 timsort 的最小运行长度

How to calculate the minrun length for timsort in Python

我是 Python 的新手,正在尝试编写一个 timsort 重新实现。在编写程序时,我无法弄清楚如何获得最小运行长度。我咨询过的消息来源描述了将 minrun 识别为:

minrun = N/minrun<=2^N where n is the size of the array.

我明白我想做什么,我只是不知道如何在 Python 中做到这一点?

任何想法或示例代码都会非常有用,谢谢!

在维基百科中trimsort-article the built in implementation of the python timsort是这样描述的:

Minrun is chosen from the range 32 to 64 inclusive, such that the size of the data, divided by minrun, is equal to, or slightly less than, a power of two. The final algorithm takes the six most significant bits of the size of the array, adds one if any of the remaining bits are set, and uses that result as the minrun. This algorithm works for all arrays, including those smaller than 64; for arrays of size 63 or less, this sets minrun equal to the array size and Timsort reduces to an insertion sort.

你可以这样做:

minrun = length
remaining_bits = length.bit_length() - 6

if remaining_bits > 0:
    minrun = length >> remaining_bits
    mask = (1 << remaining_bits) - 1
    if (length & mask) > 0: minrun += 1

应该是这样,对于任何其他 timsort 问题,请务必查看 python timsort