Why declare a list twice in __init__?
I'm reading through the Python documentation, and under Section 8.4.1 I found the following __init__ definition (abbreviated):
import collections.abc

class ListBasedSet(collections.abc.Set):
    ''' Alternate set implementation favoring space over speed
        and not requiring the set elements to be hashable. '''
    def __init__(self, iterable):
        self.elements = lst = []
        for value in iterable:
            if value not in lst:
                lst.append(value)
The part I don't get is the self.elements = lst = [] line. Why the double assignment?
Adding some print statements:
    def __init__(self, iterable):
        self.elements = lst = []
        print('elements id:', id(self.elements))
        print('lst id:', id(lst))
        for value in iterable:
            if value not in lst:
                lst.append(value)
Instantiating one:
ListBasedSet(range(3))
elements id: 4741984136
lst id: 4741984136
Out[36]: <__main__.ListBasedSet at 0x11ab12fd0>
As expected, they both point to the same PyObject.
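The same behavior can be seen without the class. The following is a minimal sketch of chained assignment: the right-hand side is evaluated once and the resulting list is bound to each target in turn, so both names refer to one object. The names `elements`, `lst`, `a` and `b` are plain variables chosen for illustration.

```python
# Chained assignment evaluates the right-hand side once, then binds
# every target to that single object.
elements = lst = []

lst.append(1)                  # mutate through one name...
same_object = elements is lst
print(same_object)             # True
print(elements)                # [1]: the change is visible through the other name

# Two separate assignments create two distinct lists instead.
a = []
b = []
print(a is b)                  # False
```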
Is brevity the only reason to do something like this? If not, why? Does it have something to do with reentrancy?
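I'd call this a case of premature optimization; you don't save that much by eliminating the dot, especially for large input iterables. Here are some timings:
Eliminating the dot: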
%timeit ListBasedSet(range(3))
The slowest run took 4.06 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 2.05 µs per loop
%timeit ListBasedSet(range(30))
100000 loops, best of 3: 18.5 µs per loop
%timeit ListBasedSet(range(3000))
10 loops, best of 3: 119 ms per loop
Meanwhile, with the dot (i.e. replacing lst with self.elements):
%timeit ListBasedSet(range(3))
The slowest run took 5.97 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 2.48 µs per loop
%timeit ListBasedSet(range(30))
10000 loops, best of 3: 22.8 µs per loop
%timeit ListBasedSet(range(3000))
10 loops, best of 3: 118 ms per loop
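As you can see, as we increase the size of the input iterable, the difference in time pretty much disappears; the appending and membership testing swamp any gains.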
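To reproduce this comparison outside IPython, here is a rough sketch using the standard `timeit` and `dis` modules. `Holder`, `build_with_local`, `build_with_dot`, `attribute_loads` and `time_variant` are hypothetical names introduced for illustration; the two `build_*` functions stand in for the two `__init__` variants. Absolute timings will differ by machine and Python version, so no expected numbers are shown.

```python
import dis
import timeit


class Holder:
    """Hypothetical stand-in for the object whose attribute is set."""


def build_with_local(self, iterable):
    # Variant from the docs: bind a local name alongside the attribute.
    self.elements = lst = []
    for value in iterable:
        if value not in lst:
            lst.append(value)


def build_with_dot(self, iterable):
    # Variant that goes through the attribute on every iteration.
    self.elements = []
    for value in iterable:
        if value not in self.elements:
            self.elements.append(value)


def attribute_loads(func, name="elements"):
    # Count bytecode instructions that load the attribute `name`.
    return sum(
        1
        for ins in dis.get_instructions(func)
        if ins.opname == "LOAD_ATTR" and ins.argval == name
    )


def time_variant(func, size, number=100):
    # Total seconds for `number` runs on range(size).
    return timeit.timeit(lambda: func(Holder(), range(size)), number=number)


# Both variants build the same deduplicated list.
h_local, h_dot = Holder(), Holder()
build_with_local(h_local, [3, 1, 3, 2, 1])
build_with_dot(h_dot, [3, 1, 3, 2, 1])
print(h_local.elements == h_dot.elements)

# The dotted variant repeats the attribute load inside the loop body.
print(attribute_loads(build_with_local), attribute_loads(build_with_dot))

for size in (3, 30, 300):
    t_local = time_variant(build_with_local, size)
    t_dot = time_variant(build_with_dot, size)
    print(f"n={size}: local {t_local:.6f}s, dot {t_dot:.6f}s")
```

The saving from the local name is a fixed amount per iteration, while `value not in lst` scans the list linearly, so the scan dominates the total as the input grows.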