Why does Python2.7 dict use more space than Python3 dict?
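I have read about Raymond Hettinger's new method of implementing compact dicts. This explains why a dict in Python 3.6 uses less memory than a dict in Python 2.7-3.5. However, there also seems to be a difference between the memory used by Python 2.7 dicts and Python 3.3-3.5 dicts. Test code: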
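import sys
# Assumed value: n is not defined in the original snippet. 86 is the
# smallest n that yields the sizes reported below.
n = 86
d = {i: i for i in range(n)}
print(sys.getsizeof(d))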
- Python 2.7: 12568
- Python 3.5: 6240
- Python 3.6: 4704
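As mentioned, I understand the savings between 3.5 and 3.6, but I am curious about the cause of the savings between 2.7 and 3.5.
It turns out this is a red herring. The rules for growing a dict changed between CPython 2.7-3.2 and CPython 3.3, and changed again in CPython 3.4 (although that change only applies when deletions occur). We can use the following code to determine when a dict expands: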
import sys
size_old = 0
for n in range(512):
    d = {i: i for i in range(n)}
    size = sys.getsizeof(d)
    # Report each n at which the reported size changes.
    if size != size_old:
        print(n, size_old, size)
        size_old = size
Python 2.7:
(0, 0, 280)
(6, 280, 1048)
(22, 1048, 3352)
(86, 3352, 12568)
Python 3.5:
0 0 288
6 288 480
12 480 864
22 864 1632
44 1632 3168
86 3168 6240
Python 3.6:
0 0 240
6 240 368
11 368 648
22 648 1184
43 1184 2280
86 2280 4704
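Keeping in mind that a dict resizes when it becomes 2/3 full, we can see that the CPython 2.7 dict implementation quadruples in size when it expands, while the CPython 3.5/3.6 dict implementations only double in size.
This is explained in a comment in the dict source code: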
/* GROWTH_RATE. Growth rate upon hitting maximum load.
* Currently set to used*2 + capacity/2.
* This means that dicts double in size when growing without deletions,
* but have more head room when the number of deletions is on a par with the
* number of insertions.
* Raising this to used*4 doubles memory consumption depending on the size of
* the dictionary, but results in half the number of resizes, less effort to
* resize.
* GROWTH_RATE was set to used*4 up to version 3.2.
* GROWTH_RATE was set to used*2 in version 3.3.0
*/
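To make the effect of the growth rate concrete, here is a minimal sketch of a simplified resize model. It is not CPython's actual code: the function name `resize_points` and its parameters are hypothetical, and the model assumes a table that starts at 8 slots, resizes as soon as an insertion leaves it at least 2/3 full, and grows to the smallest power of two greater than `used * growth_factor`.

```python
def resize_points(growth_factor, max_items=100, initial_slots=8):
    """Return (item_count, new_slot_count) for each simulated resize.

    Simplified model (an assumption, not CPython's real logic):
    resize when the table is at least 2/3 full, then grow to the
    smallest power of two strictly greater than used * growth_factor.
    """
    slots = initial_slots
    points = []
    for used in range(1, max_items + 1):
        if used * 3 >= slots * 2:  # table is at least 2/3 full
            target = used * growth_factor
            while slots <= target:
                slots *= 2
            points.append((used, slots))
    return points


print(resize_points(4))  # models GROWTH_RATE = used*4
print(resize_points(2))  # models GROWTH_RATE = used*2
```

With a factor of 4 the model resizes at 6, 22 and 86 items, the same points as the Python 2.7 measurements above. With a factor of 2 it resizes at 6, 11, 22, 43 and 86 items, which matches the Python 3.6 measurements. Python 3.5 differs by one item at two of those points (12 and 44), so the real threshold logic is evidently not identical to this model.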