为什么排序列表比未排序列表大

Question

我有一个列表my_list（列表包含utf8字符串）：

>>> len(my_list)
8777
>>> getsizeof(my_list)                     #  <-- note the size
77848

出于某种原因，排序列表 (my_sorted_list = sorted(my_list)) 使用了更多内存：

>>> len(my_sorted_list)
8777
>>> getsizeof(my_sorted_list)              #  <-- note the size
79104

为什么 sorted 返回的列表占用的内存比初始未排序列表多 space？

Answer 1

list resize operation 过度分配是为了摊销附加到列表而不是从编译器预分配的列表开始。

Answer 2

与一样，这是由于Python分配的内存比需要的多一些。这样做是为了在列表上执行 O(1) .appends。

sortedcreates a new list out of the sequence provided, sorts it in place and returns it. To create the new list, Python extends an empty sized list with the one passed；这会导致观察到的过度分配（发生在调用 list_resize 之后）。您可以使用 list.sort 来证实排序不是罪魁祸首这一事实； 使用相同的算法，但没有创建新列表（或者，众所周知，它是就地执行）。当然，那里的尺寸没有区别。

值得注意的是，这种差异主要出现在以下情况：

原始列表是使用 list-comp 创建的（其中，如果 space 可用且 the final append doesn't trigger a resize 大小较小）。
当使用列表文字时。 a PyList_New is created based on the number of values on the stack and no appends are made. Direct assigning to the underlying array is performed) 不会触发任何调整大小并将大小保持在最小值：

因此，使用 list-comp:

l = [i for i in range(10)]

getsizeof(l)          # 192
getsizeof(sorted(l))  # 200

或列表文字：

l = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

getsizeof(l)          # 144
getsizeof(sorted(l))  # 200

尺寸更小（使用文字更是如此）。

通过list创建时，内存总是过度分配； Python knows the sizes 并通过根据大小过度分配位来抢占未来的修改：

l = list(range(10))

getsizeof(l)          # 200
getsizeof(sorted(l))  # 200

因此您没有观察到列表大小的差异。

作为最后的说明，我必须指出这是 特定的 Python 的 C 实现的行为，即 CPython。它详细说明了该语言是如何实现的，因此，您不应该以任何古怪的方式依赖它。

Jython、IronPython、PyPy 和任何其他实现 might/might 没有相同的行为。

为什么排序列表比未排序列表大

Why is a sorted list bigger than an unsorted list

python

sorting

list

python-3.x

python-internals