Which types of Python objects are initialized with a reference, and which are not?

Question

I was playing around with sys.getrefcount in Python 3.7 on Windows. I tried the following:

>>> import sys
>>> x = "this is an arbitrary string"
>>> sys.getrefcount(x)
2

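I understand that one of the references is x and the other is the argument used internally by sys.getrefcount. This seems to hold no matter what type x is initialized to. However, I noticed some strange behavior when I don't assign the object to a name before passing it: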

>>> import sys
>>> sys.getrefcount("arbitrary string")
2
>>> sys.getrefcount(1122334455)
2
>>> sys.getrefcount(1122334455+1)
2
>>> sys.getrefcount(frozenset())
2
>>> sys.getrefcount(set())
1
>>> sys.getrefcount(object())
1
>>> sys.getrefcount([])
1
>>> sys.getrefcount(lambda x: x)
1
>>> sys.getrefcount(range(1122334455))
1
>>> sys.getrefcount(dict())
1
>>> sys.getrefcount(())
8341
>>> sys.getrefcount(tuple())
8340
>>> sys.getrefcount(list("arbitrary string"))
1
>>> sys.getrefcount(tuple("arbitrary string"))
1
>>> sys.getrefcount(("a", "r", "b", "i", "t", "r", "a", "r", "y", " ", "s", "t", "r", "i", "n", "g"))
2

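What is going on here? It seems that immutable types have two references while mutable types have only one? Why do some objects appear to have been assigned somewhere before being passed in, while others have only the single reference held by the argument? Does this have something to do with str/int/tuple interning?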

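Edit: a more direct question: why was the choice made that immutable types like frozenset() have a reference upon construction, while mutable types like set() do not? I understand individually why you might choose to keep such a globally scoped reference, or not keep one across the board, but why the discrepancy?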

Answer 1

That's an interesting question, and it makes for an interesting read.

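You should try getrefcount(2); for me it returned 93. That means CPython keeps 93 references to the one memory address that holds the value 2, so it doesn't have to allocate that value again. Since ints are immutable, sharing the object like this is perfectly fine.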
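A minimal sketch of that sharing, assuming CPython's small-integer cache (an implementation detail, not a language guarantee). The helper make_int is hypothetical and exists only so the integers are built at run time instead of being loaded as literals. The exact counts vary by version and session; on newer CPython releases the count for 2 may be a very large fixed value rather than a real tally.

```python
import sys


def make_int(text):
    # Build the int at run time so it is not simply a literal in this script.
    return int(text)


small = make_int("2")            # expected to come from the small-int cache
large = make_int("1122334455")   # a freshly allocated int object

small_refs = sys.getrefcount(small)
large_refs = sys.getrefcount(large)

print("references to 2:", small_refs)
print("references to 1122334455:", large_refs)

# Every request for the value 2 hands back the same shared object.
same_object = make_int("2") is small
print("2 is shared:", same_object)
```

The shared 2 shows a much higher count than the fresh large integer, which is only referenced by the name large and by the call argument.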

Now let's try two different approaches:

from sys import getrefcount

# first
getrefcount(set()) # returns 1

# second
s = set()
getrefcount(s) # returns 2

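Because set is a mutable type, creating one with set() behaves differently: a new object is allocated in memory, the only reference to it is the function argument, and it is deleted as soon as that line finishes. In the second case we define a variable and assign the set to it, so when the references are counted there is one used by s and one used by the getrefcount function.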
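To make the "one reference per holder" idea concrete, here is a small sketch that measures how the count changes instead of relying on absolute values, which can differ between CPython versions. The function name count_after_aliases is made up for this illustration.

```python
import sys


def count_after_aliases(n):
    """Return the refcount of a fresh set before and after adding n references."""
    s = set()
    before = sys.getrefcount(s)
    aliases = [s] * n  # the list now holds n more references to the same set
    after = sys.getrefcount(s)
    return before, after


before, after = count_after_aliases(3)
print("before:", before, "after:", after, "difference:", after - before)
```

Each extra holder of the set adds exactly one to the count, which is the same reason a named set reports one more reference than a temporary one.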

As for the tuple results: in Python tuples are immutable, which is why it returns such a huge number. CPython keeps a large number of references to the empty tuple.

Answer 2

Answering my own question as I've learned more.

The difference has to do with how Python code objects (the compiled bytecode) are laid out. The bytecode for sys.getrefcount("arbitrary string") looks like this:

>>> import dis
>>> dis.dis('sys.getrefcount("arbitrary string")')
  1           0 LOAD_NAME                0 (sys)
              2 LOAD_METHOD              1 (getrefcount)
              4 LOAD_CONST               0 ('arbitrary string')
              6 CALL_METHOD              1
              8 RETURN_VALUE

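Here, the LOAD_CONST opcode does not build a new string from scratch; it just loads a string from the code object's tuple of constants. That tuple is what holds the extra reference: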

>>> f = lambda: sys.getrefcount("arbitrary string")
>>> f.__code__.co_consts
(None, 'arbitrary string')
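The same check can be done without a lambda by compiling the expression directly. This is only a sketch: the "<demo>" filename is an arbitrary placeholder, and the exact contents of co_consts beyond the literal itself may vary between CPython versions.

```python
import sys

source = 'sys.getrefcount("arbitrary string")'
code = compile(source, "<demo>", "eval")

# The string literal lives in the code object's constants tuple.
print(code.co_consts)
literal_is_constant = "arbitrary string" in code.co_consts
print("literal stored in co_consts:", literal_is_constant)

# Evaluating the code object runs the getrefcount call on that stored constant.
count = eval(code)
print("reported refcount:", count)
```

As long as the code object is alive, its co_consts tuple keeps the string alive too, which accounts for the reference beyond the function argument.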

With that in mind, some of the examples make sense:

>>> import sys

# the string is stored in co_consts
>>> sys.getrefcount("arbitrary string")
2

# the integer is stored in co_consts
>>> sys.getrefcount(1122334455)
2

# this addition is constant-folded and the result is stored in co_consts
>>> sys.getrefcount(1122334455+1)
2

# Tuples of constants are folded into one big constant:
>>> sys.getrefcount(("a", "r", "b", "i", "t", "r", "a", "r", "y", " ", "s", "t", "r", "i", "n", "g"))
2
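Constant folding can be observed directly by compiling the expressions and looking at co_consts. This is a sketch that assumes CPython's compile-time folding, which is an optimization of the implementation and not something the language promises.

```python
# Compile the expressions without running them and inspect their constants.
folded_sum = compile("1122334455 + 1", "<demo>", "eval")
folded_tuple = compile('("a", "r", "b")', "<demo>", "eval")

# The sum is computed at compile time and stored as a single constant.
sum_is_constant = 1122334456 in folded_sum.co_consts
print(folded_sum.co_consts, sum_is_constant)

# A tuple made only of constants is stored as one constant tuple.
tuple_is_constant = ("a", "r", "b") in folded_tuple.co_consts
print(folded_tuple.co_consts, tuple_is_constant)
```

In both cases the finished object already exists inside the code object before the call happens, so the call sees it with a reference already held.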

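Meanwhile, the following objects cannot be stored in co_consts because they have to be rebuilt every time, either because they are mutable or because they depend on a call to some function whose name has to be looked up: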

>>> sys.getrefcount(set()) # construct a new set at call time
1
>>> sys.getrefcount(object()) # construct a new object at call time
1
>>> sys.getrefcount([]) # construct a new list at call time
1
>>> sys.getrefcount(lambda x: x) # construct a new function at call time
1

# construct a new range object at call time.
# We could do something dumb like range = abs,
# so this can't be constant-folded.
>>> sys.getrefcount(range(1122334455))
1

>>> sys.getrefcount(dict()) # construct a new dict at call time
1
>>> sys.getrefcount(list("arbitrary string")) # construct a new list at call time
1

# Construct a new tuple at call time.
# We could do something dumb like tuple=list
# so this can't be constant-folded.
>>> sys.getrefcount(tuple("arbitrary string")) 
1
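The "something dumb like tuple=list" case can be sketched as follows. The code object only records the name tuple, so the result depends on whatever that name is bound to when the code runs. The namespace dictionary here is a made-up stand-in for a module where the name has been rebound.

```python
call_code = compile("tuple('arbitrary string')", "<demo>", "eval")

# The call is compiled as a name lookup plus a call, not as a stored result.
name_is_looked_up = "tuple" in call_code.co_names
result_not_constant = tuple("arbitrary string") not in call_code.co_consts
print(call_code.co_names, call_code.co_consts)

# Run the same code object with the name `tuple` rebound to `list`.
namespace = {"tuple": list}
rebound_result = eval(call_code, namespace)
print(type(rebound_result).__name__)

# Run it again with the ordinary builtin.
normal_result = eval(call_code, {})
print(type(normal_result).__name__)
```

Because the same bytecode can yield a list or a tuple depending on the namespace, the compiler cannot precompute the result, and a fresh object is built on every call.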

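Finally, a third class of objects are those that use some sort of caching or interning. In that case, when a new object is constructed, Python can cheat in some way and give you an object that already exists.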

# There is only one empty frozenset that gets re-used.
>>> sys.getrefcount(frozenset())
2

# There is only one empty tuple that gets re-used
# whenever someone requests an empty tuple
>>> sys.getrefcount(())
8341

# The same thing, but this tuple does not get stored
# in co_consts because the name `tuple` could be rebound.
>>> sys.getrefcount(tuple())
8340

To verify the claim about frozensets, note that the refcount of the empty frozenset goes up when new ones are constructed:

>>> sys.getrefcount(frozenset())
2
>>> x, y, z = frozenset(), frozenset(), frozenset()
>>> sys.getrefcount(frozenset())
5
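Another way to probe for this kind of sharing is an identity check, sketched below with a hypothetical helper is_shared_empty. Whether the empty frozenset is shared is a CPython implementation detail tied to the interpreter version: the session above is from 3.7, and as far as I know newer releases no longer reuse a single empty frozenset, so that line is printed without assuming a result.

```python
def is_shared_empty(factory):
    """Return True if two calls to factory() give back the very same object."""
    first = factory()
    second = factory()  # `first` is still alive, so no memory reuse can fool us
    return first is second


for factory in (tuple, frozenset, list, set):
    print(factory.__name__, is_shared_empty(factory))
```

On CPython the empty tuple is reported as shared, while list and set always produce distinct objects, since sharing a mutable object would let a change through one reference leak into every other user.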

python

garbage-collection

cpython

reference