Understanding memory allocation for large integers in Python

Question

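How does Python allocate memory for large integers?

An int has a size of 28 bytes, and as I keep increasing its value, the size grows in increments of 4 bytes.

Why 28 bytes initially for any value as low as 1?
Why increments of 4 bytes?

PS: I am running Python 3.5.2 on x86_64 (a 64-bit machine). I am looking for any pointers/resources/PEPs on how the (3.0+) interpreters handle such huge numbers.

Code illustrating the sizes: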

>>> a=1
>>> print(a.__sizeof__())
28
>>> a=1024
>>> print(a.__sizeof__())
28
>>> a=1024*1024*1024
>>> print(a.__sizeof__())
32
>>> a=1024*1024*1024*1024
>>> print(a.__sizeof__())
32
>>> a=1024*1024*1024*1024*1024*1024
>>> a
1152921504606846976
>>> print(a.__sizeof__())
36

Answer 1

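It's actually simple. Python's int is not the kind of primitive type you may be used to from other languages, but a full-fledged object, with its own methods and everything else. That is where the overhead comes from.

Then you have the payload itself: the integer being represented. There is no limit on that other than your available memory.

The size of a Python int is whatever is needed to represent the number, plus a little overhead.

If you want to read further, take a look at the relevant part of the documentation: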

Integers have unlimited precision
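As a quick illustration of the point above, here is a minimal sketch that checks arithmetic on a very large int stays exact and that the reported size grows with the value. The variable names are only for illustration, and the byte counts printed are specific to CPython and the build in use.

```python
import sys

# Sizes are CPython-specific; other implementations may report different numbers.
small = 1
huge = 10 ** 100  # far beyond a 64-bit machine word

# Arithmetic on the large value stays exact: no overflow, no rounding.
assert huge * huge == 10 ** 200

for value in (small, 2 ** 64, huge):
    print(value.bit_length(), "bits ->", sys.getsizeof(value), "bytes")
```

The fixed part of each reported size is the object overhead; the rest is the payload, which grows as the number needs more bits.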

Answer 2

Why 28 bytes initially for any value as low as 1?

I believe the answer above covers this completely; Python uses C structs to represent objects in the Python world, any objects including ints:

struct _longobject {
    PyObject_VAR_HEAD
    digit ob_digit[1];
};

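PyObject_VAR_HEAD is a macro that, when expanded, adds another field to the struct (a PyVarObject field, which is used specifically for objects that have some notion of length), and ob_digit is an array holding the value of the number. The boiler-plate in the size comes from that struct, for small and large Python numbers alike. On a 64-bit build that header holds a reference count, a type pointer and a length field at 8 bytes each, so 24 bytes of header plus one 4-byte digit gives the 28 bytes you see.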

Why increments of 4 bytes?

Because, when a larger number is created, the size (in bytes) grows by a multiple of sizeof(digit); you can see that in _PyLong_New, where the memory for a new longobject is allocated with PyObject_MALLOC:

/* Number of bytes needed is: offsetof(PyLongObject, ob_digit) +
   sizeof(digit)*size.  Previous incarnations of this code used
   sizeof(PyVarObject) instead of the offsetof, but this risks being
   incorrect in the presence of padding between the PyVarObject header
   and the digits. */
if (size > (Py_ssize_t)MAX_LONG_DIGITS) {
    PyErr_SetString(PyExc_OverflowError,
                    "too many digits in integer");
    return NULL;
}
result = PyObject_MALLOC(offsetof(PyLongObject, ob_digit) +
                         size*sizeof(digit));

Note: offsetof(PyLongObject, ob_digit) is the 'boiler-plate' (in bytes) of the long object that isn't related to holding its value.

digit is defined in the header file that holds struct _longobject, as a typedef for uint32_t:

typedef uint32_t digit;

and sizeof(uint32_t) is 4 bytes. That is the amount by which the size in bytes increases each time the size argument to _PyLong_New goes up by one.
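The arithmetic above can be checked from Python itself. The following is a rough sketch, not CPython's own code: digits_needed and predicted_size are hypothetical helpers, the header size is estimated by measuring a one-digit int rather than read from the C struct, and it only covers positive values. It relies on sys.int_info, which reports how many bits of the value each digit stores (30 on typical 64-bit builds, which is why 1024*1024*1024 already needs a second digit) and how many bytes a digit occupies.

```python
import sys

BITS_PER_DIGIT = sys.int_info.bits_per_digit   # 30 on typical 64-bit builds
SIZEOF_DIGIT = sys.int_info.sizeof_digit       # 4 when digit is uint32_t

# Fixed overhead, estimated by measuring a one-digit int and subtracting its digit.
HEADER_BYTES = (1).__sizeof__() - SIZEOF_DIGIT


def digits_needed(n):
    """Number of internal digits for a positive int n (hypothetical helper)."""
    return max(1, -(-n.bit_length() // BITS_PER_DIGIT))  # ceiling division


def predicted_size(n):
    """Predicted n.__sizeof__() for a positive int n (hypothetical helper)."""
    return HEADER_BYTES + digits_needed(n) * SIZEOF_DIGIT


# Columns: value, digits, predicted bytes, measured bytes.
for n in (1, 1024, 1024 ** 3, 1024 ** 4, 1024 ** 6):
    print(n, digits_needed(n), predicted_size(n), n.__sizeof__())
```

The predicted and measured columns should agree on CPython, with the size stepping up by sizeof(digit) each time the value crosses another multiple of the bits-per-digit boundary.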

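Of course, this is just how CPython has chosen to implement it. It is an implementation detail, so you won't find much information about it in PEPs. The python-dev mailing list would hold the implementation discussions, if you can find the corresponding thread :-).

Either way, you may find different behavior in other popular implementations, so don't take this one for granted.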


python

int

python-3.x

python-internals