Why does defaultdict's default_factory default to None?

You don't have to specify a default factory (and passing None explicitly gives the same result):

>>> from collections import defaultdict
>>> defaultdict()
defaultdict(None, {})
>>> defaultdict(None)
defaultdict(None, {})

Why None? It leads to this:

>>> dd = defaultdict()
>>> dd[0]
# TypeError: 'NoneType' object is not callable  <-- expected behaviour
# KeyError: 0                                   <-- actual behaviour

It is even explicitly allowed: if you try to create a defaultdict from some other object, say defaultdict(0), a check fails with

TypeError: first argument must be callable or None

I would have thought something like lambda: None would be a better default factory. Why is default_factory optional? I don't understand the use case.
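For reference, here is a small sketch of how the two choices differ in practice. The variable names are only for illustration; the behaviour is what the collections documentation describes for a factory of None versus a callable.

```python
from collections import defaultdict

plain = defaultdict()                 # default_factory is None
nullable = defaultdict(lambda: None)  # the alternative suggested above

# With no factory, a missing key raises KeyError and nothing is stored.
try:
    plain["missing"]
    raised = False
except KeyError:
    raised = True

# With lambda: None, the lookup succeeds and silently inserts the key.
value = nullable["missing"]

print(raised)         # True
print(value)          # None
print(len(plain))     # 0
print(len(nullable))  # 1
```

So lambda: None would not just change the error; every lookup of an unknown key would also grow the dict.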

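When Guido van Rossum initially proposed a DefaultDict, it had a default value (unlike the current defaultdict, which uses a callable rather than a value) that was set during construction and was read-only (also unlike defaultdict).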

After some discussion, Guido revised the proposal. Here are the relevant points:

Many, many people suggested to use a factory function instead of a default value. This is indeed a much better idea (although slightly more cumbersome for the simplest cases).

...

Let's add a generic missing-key handling method to the dict class, as well as a default_factory slot initialized to None.

...

[T]he default implementation is designed so that we can write

d = {}
d.default_factory = list

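The important thing to note is that the new functionality no longer belonged to a subclass. That means setting default_factory in the constructor would have broken existing code. So, by design, setting default_factory had to happen after the dict was created. Its initial value was None, and it was now a mutable attribute so that it could be meaningfully overridden.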

After more discussion, it was decided that it would perhaps be best not to complicate the regular dict type with a defaultdict specialization.
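As a rough illustration of the mechanism under discussion, the sketch below approximates it in pure Python with the __missing__ hook on a dict subclass. The class name SketchDefaultDict is made up for this example, and this is not how the real collections.defaultdict (which is implemented in C) is written; it only mirrors the documented behaviour.

```python
class SketchDefaultDict(dict):
    """Hypothetical pure-Python approximation of defaultdict."""

    def __init__(self, default_factory=None, *args, **kwargs):
        # Mirror the check quoted in the question.
        if default_factory is not None and not callable(default_factory):
            raise TypeError("first argument must be callable or None")
        super().__init__(*args, **kwargs)
        self.default_factory = default_factory  # plain, writable attribute

    def __missing__(self, key):
        # dict.__getitem__ calls this hook only when the key is absent.
        if self.default_factory is None:
            raise KeyError(key)
        value = self[key] = self.default_factory()
        return value


d = SketchDefaultDict()
d.default_factory = list  # the two-step style from the revised proposal
d["a"].append(1)
print(d)  # {'a': [1]}
```

With the factory left at None, the __missing__ hook simply raises KeyError, which is why an instance built without arguments behaves like a plain dict.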

Steven Bethard then asked for clarification regarding the constructor:

Should default_factory be an argument to the constructor? The three answers I see:

  • "No." I'm not a big fan of this answer. Since the whole point of creating a defaultdict type is to provide a default, requiring two statements (the constructor call and the default_factory assignment) to initialize such a dictionary seems a little inconvenient.
  • "Yes and it should be followed by all the normal dict constructor arguments." This is okay, but a few errors, like defaultdict({1:2}) will pass silently (until you try to use the dict, of course).
  • "Yes and it should be the only constructor argument." This is my favorite mainly because I think it's simple, and I couldn't think of good examples where I really wanted to do defaultdict(list, some_dict_or_iterable) or defaultdict(list, **some_keyword_args). It's also forward compatible if we need to add some of the dict constructor args in later.

Guido van Rossum decided that:

The defaultdict signature takes an optional positional argument which is the default_factory, defaulting to None. The remaining positional and all keyword arguments are passed to the dict constructor. IOW:

d = defaultdict(list, [(1, 2)])

is equivalent to:

d = defaultdict()  
d.default_factory = list  
d.update([(1, 2)])

Note that the expanded code exactly mirrors how it would have worked when Guido was considering changing dict itself to provide the defaultdict behaviour.
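A quick way to check that equivalence, as a sketch (the names short_form, long_form and with_kwargs are only for illustration):

```python
from collections import defaultdict

short_form = defaultdict(list, [(1, 2)])

long_form = defaultdict()
long_form.default_factory = list
long_form.update([(1, 2)])

# Same contents and same factory either way.
print(short_form == long_form)                                  # True
print(short_form.default_factory is long_form.default_factory)  # True

# Keyword arguments are forwarded to the dict constructor as well.
with_kwargs = defaultdict(int, a=1)
print(with_kwargs["a"], with_kwargs["b"])  # 1 0
```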

He also provided some justifications upthread:

Even if the default_factory were passed to the constructor, it still ought to be a writable attribute so it can be introspected and modified. A defaultdict that can't change its default factory after its creation is less useful.

Bengt Richter explains why you might want a mutable default factory:

My guess is that realistically default_factory will be used to make clean code for filling a dict, and then turning the factory off if it's to be passed into unknown contexts. Those contexts can then use old code to do as above, or if worth it can temporarily set a factory to do some work. Tightly coupled code I guess could pass factory-enabled dicts between each other.
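The "fill, then turn the factory off" pattern described there might look like the following sketch. The function group_by_length and its sample input are invented for illustration.

```python
from collections import defaultdict


def group_by_length(words):
    """Group words by length, then hand back a dict with the factory off."""
    groups = defaultdict(list)
    for word in words:
        groups[len(word)].append(word)  # factory creates lists on demand
    groups.default_factory = None       # from here on, missing keys raise
    return groups


groups = group_by_length(["a", "bb", "cc", "ddd"])
print(groups[2])  # ['bb', 'cc']

try:
    groups[10]    # a stray lookup no longer creates an empty entry
except KeyError:
    print("KeyError")

print(10 in groups)  # False
```

Code that receives groups can then treat it like an ordinary dict, without lookups quietly adding keys.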

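My guess is that the design is intentional: a defaultdict instance behaves like a plain dict by default, while its behaviour can still be changed dynamically later through simple attribute access.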

For example:

>>> d = defaultdict()
>>> d['k']  # hey I'm just a plain old dict ;) 
KeyError: 'k'
>>> d.default_factory = list
>>> d['L']  # actually, I'm really a defaultdict(list)
[]
>>> d.default_factory = int  # just kidding!  I'm a counter
>>> d['i']
0
>>> d
defaultdict(<class 'int'>, {'L': [], 'i': 0})

And we can reset it to something that looks like a vanilla dict by setting the factory back to None (after which a missing key raises KeyError again).

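I haven't yet found a pattern where this would be useful, but this kind of usage would be impossible if instantiating a defaultdict required a callable as a positional argument.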