Why does splatting create a tuple on the RHS but a list on the LHS?

Consider, for example:

squares = *map((2).__rpow__, range(5)),
squares
# (0, 1, 4, 9, 16)

*squares, = map((2).__rpow__, range(5))
squares
# [0, 1, 4, 9, 16]

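So, all else being equal, we get a list when splatting on the LHS and a tuple when splatting on the RHS.

Why?

Is this by design, and if so, what is the rationale? Or, if not, are there technical reasons? Or is it simply how things turned out, for no particular reason?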

Not a complete answer, but disassembling gives some clues:

from dis import dis

def a():
    squares = (*map((2).__rpow__, range(5)),)
    # print(squares)

dis(a)  # dis() prints the listing itself and returns None

which disassembles to

  5           0 LOAD_GLOBAL              0 (map)
              2 LOAD_CONST               1 (2)
              4 LOAD_ATTR                1 (__rpow__)
              6 LOAD_GLOBAL              2 (range)
              8 LOAD_CONST               2 (5)
             10 CALL_FUNCTION            1
             12 CALL_FUNCTION            2
             14 BUILD_TUPLE_UNPACK       1
             16 STORE_FAST               0 (squares)
             18 LOAD_CONST               0 (None)
             20 RETURN_VALUE

def b():
    *squares, = map((2).__rpow__, range(5))
dis(b)  # dis() prints the listing itself and returns None

results in

 11           0 LOAD_GLOBAL              0 (map)
              2 LOAD_CONST               1 (2)
              4 LOAD_ATTR                1 (__rpow__)
              6 LOAD_GLOBAL              2 (range)
              8 LOAD_CONST               2 (5)
             10 CALL_FUNCTION            1
             12 CALL_FUNCTION            2
             14 UNPACK_EX                0
             16 STORE_FAST               0 (squares)
             18 LOAD_CONST               0 (None)
             20 RETURN_VALUE

The doc on UNPACK_EX states:

UNPACK_EX(counts)

Implements assignment with a starred target: Unpacks an iterable in TOS into individual values, where the total number of values can be smaller than the number of items in the iterable: one of the new values will be a list of all leftover items.

The low byte of counts is the number of values before the list value, the high byte of counts the number of values after it. The resulting values are put onto the stack right-to-left.

(emphasis mine). BUILD_TUPLE_UNPACK, on the other hand, returns a tuple:

BUILD_TUPLE_UNPACK(count)

Pops count iterables from the stack, joins them in a single tuple, and pushes the result. Implements iterable unpacking in tuple displays (*x, *y, *z).
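The opcode names in the listings above come from an older CPython release, and they differ between versions. As a rough sketch rather than a definitive reference, the following collects the opcode names for both forms on whatever interpreter runs it. The names rhs_splat, lhs_splat and opnames are made up for this illustration.

```python
import dis


def rhs_splat():
    # Splat inside a tuple display on the right-hand side.
    squares = (*map((2).__rpow__, range(5)),)
    return squares


def lhs_splat():
    # Starred target on the left-hand side.
    *squares, = map((2).__rpow__, range(5))
    return squares


def opnames(func):
    # Hypothetical helper: list the opcode names of a function's bytecode.
    return [ins.opname for ins in dis.get_instructions(func)]


rhs_ops = opnames(rhs_splat)
lhs_ops = opnames(lhs_splat)

print(rhs_ops)  # tuple-building opcodes; exact names depend on the version
print(lhs_ops)  # expected to contain UNPACK_EX
print(type(rhs_splat()).__name__, type(lhs_splat()).__name__)
```

Whatever the opcodes are called on a given version, the observable result is the same: the first function returns a tuple and the second a list.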

This is noted in the Disadvantages section of PEP 448:

Whilst *elements, = iterable causes elements to be a list, elements = *iterable, causes elements to be a tuple. The reason for this may confuse people unfamiliar with the construct.

Also, per the PEP 3132 specification:

This PEP proposes a change to iterable unpacking syntax, allowing to specify a "catch-all" name which will be assigned a list of all items not assigned to a "regular" name.

It is also mentioned in the Python 3 documentation on expression lists:

Except when part of a list or set display, an expression list containing at least one comma yields a tuple.
The trailing comma is required only to create a single tuple (a.k.a. a singleton); it is optional in all other cases. A single expression without a trailing comma doesn’t create a tuple, but rather yields the value of that expression. (To create an empty tuple, use an empty pair of parentheses: ().)
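A small sketch of the rule quoted above, with no splat involved: the comma is what makes the tuple, and the parentheses are only needed for the empty one. The variable names are arbitrary.

```python
singleton = 1,        # trailing comma: a one-element tuple
pair = 1, 2           # at least one comma: a tuple
just_an_int = (1)     # parentheses alone do not make a tuple
empty = ()            # the empty tuple needs the parentheses

for value in (singleton, pair, just_an_int, empty):
    print(type(value).__name__, value)
```

Seen this way, elements = *iterable, is just another expression list with a comma in it, which is why it yields a tuple.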

This can also be seen in a simpler example. Here, elements is a list:

In [27]: *elements, = range(6)

In [28]: elements
Out[28]: [0, 1, 2, 3, 4, 5]

and here, elements is a tuple:

In [13]: elements = *range(6),

In [14]: elements
Out[14]: (0, 1, 2, 3, 4, 5)

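From the comments and the other answers, I understand that:

  • The tuple behaviour on the RHS is in line with the existing arbitrary argument lists used in functions, i.e. *args

  • The list behaviour on the LHS lets you keep working with the variable further down in the evaluation, so making it a list, a mutable value, rather than a tuple makes more sense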
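The two points above can be illustrated with a short sketch. The function collect is a made-up example, not something from the PEPs.

```python
def collect(*args):
    # Inside the function, args is always a tuple,
    # whatever iterable was splatted into the call.
    return args


packed = collect(*range(3))
print(type(packed).__name__, packed)

# On the LHS the starred target is a list, so it can be modified in place.
first, *rest = range(5)
rest.append(99)
print(first, rest)
```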

The reason is given at the end of PEP 3132 -- Extended Iterable Unpacking:

Acceptance

After a short discussion on the python-3000 list [1], the PEP was accepted by Guido in its current form. Possible changes discussed were:

[...]

Make the starred target a tuple instead of a list. This would be consistent with a function's *args, but make further processing of the result harder.

[1] https://mail.python.org/pipermail/python-3000/2007-May/007198.html

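So the advantage of having a mutable list instead of an immutable tuple seems to be the reason.

For the RHS, there is not much of an issue. The answer here puts it well: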

We have it working as it usually does in function calls. It expands the contents of the iterable it is attached to. So, the statement:

elements = *iterable

can be viewed as:

elements = 1, 2, 3, 4,

which is another way for a tuple to be initialized.

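Now, for the LHS: yes, there are technical reasons for the LHS using a list, as shown by the discussion around the initial PEP 3132 for extending unpacking.

The reasons can be gleaned from the conversation on the PEP (added at the end).

Essentially, it boils down to a few key factors:

  • The LHS had to support a "starred expression" that was not necessarily restricted to the end.
  • The RHS needed to accept various sequence types, including iterators.
  • The combination of the two points above required manipulation/mutation of the contents after they were accepted into the starred expression.
  • An alternative approach, mimicking the iterator fed on the RHS, was shot down by Guido for its inconsistent behaviour, even leaving implementation difficulties aside.
  • Given all the factors above, a tuple on the LHS would have to be a list first and then be converted. This approach would only add overhead, and did not invite any further discussion.

Summary: A combination of various factors led to the decision to use a list on the LHS, and the reasons fed off of each other.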
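To make the first two factors concrete, here is a minimal sketch with a starred target in the middle and a generator on the RHS. The generator function squares is an assumption made for the example.

```python
def squares(n):
    # A generator: it has no len() and cannot be sliced,
    # so its items can only be consumed one by one.
    for i in range(n):
        yield i * i


first, *middle, last = squares(6)
print(first, middle, last)

# The same works when the starred target comes first.
*init, final = iter("abc")
print(init, final)
```

Because the length is unknown until the iterator is exhausted, the items for the starred target have to be collected into something growable before the trailing names can be filled.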


Relevant extract for disallowing inconsistent types:

The important use case in Python for the proposed semantics is when you have a variable-length record, the first few items of which are interesting, and the rest of which is less so, but not unimportant. (If you wanted to throw the rest away, you'd just write a, b, c = x[:3] instead of a, b, c, *d = x.) It is much more convenient for this use case if the type of d is fixed by the operation, so you can count on its behavior.

There's a bug in the design of filter() in Python 2 (which will be fixed in 3.0 by turning it into an iterator BTW): if the input is a tuple, the output is a tuple too, but if the input is a list or anything else, the output is a list. That's a totally insane signature, since it means that you can't count on the result being a list, nor on it being a tuple -- if you need it to be one or the other, you have to convert it to one, which is a waste of time and space. Please let's not repeat this design bug. -Guido
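The "type of d is fixed by the operation" point from the extract can be checked with a short sketch. The list of sample inputs is arbitrary.

```python
sources = [
    [1, 2, 3, 4, 5],   # list
    (1, 2, 3, 4, 5),   # tuple
    "abcde",           # string
    range(5),          # range
    iter(range(5)),    # plain iterator
]

rest_types = []
for x in sources:
    a, b, c, *d = x
    rest_types.append(type(d))
    print(type(x).__name__, "->", type(d).__name__, d)
```

Whatever goes in on the right, d comes out as a list, so code that follows can rely on list behaviour without converting.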


I have also tried to recreate the partially quoted conversation that pertains to the summary above. Source. Emphasis mine.

1.

In argument lists, *args exhausts iterators, converting them to tuples. I think it would be confusing if *args in tuple unpacking didn't do the same thing.

This brings up the question of why the patch produces lists, not tuples. What's the reasoning behind that?

STeVe

2.

IMO, it's likely that you would like to further process the resulting sequence, including modifying it.

Georg

3.

Well if that's what you're aiming at, then I'd expect it to be more useful to have the unpacking generate not lists, but the same type you started with, e.g. if I started with a string, I probably want to continue using strings:: --additional text snipped off

4.

When dealing with an iterator, you don't know the length in advance, so the only way to get a tuple would be to produce a list first and then create a tuple from it. Greg

5.

Yep. That was one of the reasons it was suggested that the *args should only appear at the end of the tuple unpacking.

STeVe

A couple of messages skipped

6.

I don't think that returning the type given is a goal that should be attempted, because it can only ever work for a fixed set of known types. Given an arbitrary sequence type, there is no way of knowing how to create a new instance of it with specified contents.

-- Greg

Messages skipped

7.

I'm suggesting, that:

  • lists return lists
  • tuples return tuples
  • XYZ containers return XYZ containers
  • non-container iterables return iterators.

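How do you propose to distinguish between the last two cases? Attempting to slice it and catching an exception is not acceptable, IMO, as it can too easily mask bugs.

-- Greg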

8.

But I expect less useful. It won't support "a, *b, c = " either. From an implementation POV, if you have an unknown object on the RHS, you have to try slicing it before you try iterating over it; this may cause problems e.g. if the object happens to be a defaultdict -- since x[3:] is implemented as x[slice(None, 3, None)], the defaultdict will give you its default value. I'd much rather define this in terms of iterating over the object until it is exhausted, which can be optimized for certain known types like lists and tuples.

--Guido van Rossum

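The fact that you get a tuple on the RHS has nothing to do with the splat. The splat just unpacks your map iterator. What you unpack it into is decided by the fact that you've used tuple syntax:

*whatever,

instead of list syntax:

[*whatever]

or set syntax:

{*whatever}

You could have gotten a list or a set. You just told Python to make a tuple.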
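A quick sketch of the three displays applied to the iterator from the question. The helper fresh is hypothetical; it exists only because a map object can be consumed once.

```python
def fresh():
    # map() returns a one-shot iterator, so build a new one per display.
    return map((2).__rpow__, range(5))


as_tuple = *fresh(),    # tuple syntax
as_list = [*fresh()]    # list syntax
as_set = {*fresh()}     # set syntax

print(type(as_tuple).__name__, as_tuple)
print(type(as_list).__name__, as_list)
print(type(as_set).__name__, sorted(as_set))
```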


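On the LHS, a splatted assignment target always produces a list. It doesn't matter whether you use "tuple-style"

*target, = whatever

or "list-style"

[*target] = whatever

syntax for the target list. The syntax looks a lot like the syntax for creating a list or tuple, but target list syntax is an entirely different thing.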
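A small sketch of that claim, using an arbitrary string as the source and including the parenthesised form as a third spelling:

```python
source = "abc"

*t, = source      # "tuple-style" target list
[*u] = source     # "list-style" target list
(*v,) = source    # parenthesised target list

for name, value in (("t", t), ("u", u), ("v", v)):
    print(name, type(value).__name__, value)
```

The brackets or parentheses around the targets only group them; they do not choose the type of what gets bound.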

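The syntax you're using on the left was introduced in PEP 3132, to support use cases like

first, *rest = iterable

In an unpacking assignment, elements of an iterable are assigned to unstarred targets by position, and if there's a starred target, any extras are stuffed into a list and assigned to that target. A list was chosen instead of a tuple to make further processing easier. Since you have only a starred target in your example, all items go in the "extras" list assigned to that target.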
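A minimal sketch of the positional rule, including the two edge cases of no extras and too few items. The variable names are arbitrary.

```python
first, *rest = [1, 2, 3]
print(first, rest)          # extras go into the list

only, *nothing = [1]
print(only, nothing)        # no extras: the starred target is an empty list

raised = False
try:
    x, y, *z = [1]          # not enough items for the unstarred targets
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```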

TLDR: You get a tuple on the RHS because you asked for one. You get a list on the LHS because it's easier.


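It is important to keep in mind that the RHS is evaluated before the LHS - this is why a, b = b, a works. The difference then becomes apparent when splitting the assignment and using additional capabilities for the LHS and RHS: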

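head, tail = 1, [2, 3]  # sample values so the snippet runs
# RHS: Expression List
a = head, *tail         # a is the tuple (1, 2, 3)
# LHS: Target List
*leading, last = a      # leading is the list [1, 2], last is 3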

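In short, while the two look similar, they are entirely different things. The RHS is an expression that creates one tuple from all names - the LHS is a binding to multiple names from one tuple. Even if you see the LHS as a tuple of names, that does not restrict the type of each name.


The RHS is an expression list - a tuple literal without the optional () parentheses. This is the same as how 1, 2 creates a tuple even without parentheses, and how enclosing [] or {} creates a list or set. The *tail just means unpacking into this tuple.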

New in version 3.5: Iterable unpacking in expression lists, originally proposed by PEP 448.

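The LHS does not create one value, it binds values to multiple names. With a catch-all name such as *leading, the binding is not known up-front in all cases. Instead, the catch-all contains whatever remains.

Using a list to store the values makes this simple - the values for trailing names can be efficiently removed from the end. The remaining list then contains exactly the values for the catch-all name. In fact, this is exactly what CPython does: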

  • collect all items for mandatory targets before the starred one
  • collect all remaining items from the iterable in a list
  • pop items for mandatory targets after the starred one from the list
  • push the single items and the resized list on the stack

Even when the LHS has a catch-all name without trailing names, it is a list for consistency.
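The four steps listed above can be modelled in pure Python. This is only a sketch of the described behaviour, not CPython's actual C code; the function unpack_ex and its parameters before and after are names invented here.

```python
def unpack_ex(iterable, before, after):
    """Hypothetical model of unpacking with one starred target.

    before/after are the numbers of plain targets on either side of it.
    """
    it = iter(iterable)

    # 1. Collect the items for the mandatory targets before the star.
    head = []
    for _ in range(before):
        try:
            head.append(next(it))
        except StopIteration:
            raise ValueError("not enough values to unpack") from None

    # 2. Collect everything that is left into a list.
    rest = list(it)
    if len(rest) < after:
        raise ValueError("not enough values to unpack")

    # 3. Take the items for the trailing targets off the end of the list.
    cut = len(rest) - after
    tail = rest[cut:]
    del rest[cut:]

    # 4. Hand back the single items and the shrunken list.
    return (*head, rest, *tail)


print(unpack_ex(range(6), 1, 2))   # like: a, *b, c, d = range(6)
print(unpack_ex("abc", 0, 0))      # like: *a, = "abc"
```

With no trailing targets, step 3 removes nothing and the list from step 2 is returned unchanged, which matches the consistency point above.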

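Using a = *b,:

If you do:

a = *[1, 2, 3],

It gives:

(1, 2, 3)

Because:

  1. Unpacking, like some other constructs, gives a tuple by default, but if you write e.g.

[*[1, 2, 3]]

the output is:

[1, 2, 3] as a list, since I wrote a list display, and likewise {*[1, 2, 3]} gives a set.

  2. Unpacking gives three elements, so for [1, 2, 3] it really just does

1, 2, 3

which outputs:

(1, 2, 3)

That's what unpacking does.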

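The main part:

Unpacking simply executes:

1, 2, 3

for:

[1, 2, 3]

which is a tuple:

(1, 2, 3)

As a CPython implementation detail, this builds a list first and then converts it to a tuple.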

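Using *a, = b

Well, this is really going to be:

a = [1, 2, 3]

since it is not:

*a, b = [1, 2, 3]

or something similar, so there is not much to say about it.

  1. It is roughly equivalent to the assignment without the * and the comma, but not fully: it always gives a list.

  2. This is really almost only used with multiple variables, i.e.:

*a, b = [1, 2, 3]

One thing is that no matter what the source type is, it stores a list: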

>>> *a, = {1,2,3}
>>> a
[1, 2, 3]
>>> *a, = (1,2,3)
>>> a
[1, 2, 3]
>>> 

Also, it would be strange to have:

a, *b = 'hello'

and:

print(b)

give:

'ello'

Then it wouldn't seem like splatting.
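For comparison, a quick sketch of what actually happens with a string: the starred name receives a list of one-character strings, which can be joined back when a string is wanted. The name rejoined is arbitrary.

```python
a, *b = 'hello'
print(a)   # the first character
print(b)   # a list of the remaining characters, not a string

# Converting back is an explicit step.
rejoined = ''.join(b)
print(rejoined)
```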

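Also, a list has more methods than the other types and is easier to handle.

There is probably no deeper reason for this; it is really just a design decision in Python.

For the a = *b, part there is a reason, given in the "The main part:" section.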

Summary:

Also, as @Devesh mentioned, from the PEP 448 disadvantages:

Whilst *elements, = iterable causes elements to be a list, elements = *iterable, causes elements to be a tuple. The reason for this may confuse people unfamiliar with the construct.

(emphasis mine)

Why bother? This doesn't really matter to us. If you want a list, why not just use:

print([*a])

Or a tuple:

print((*a,))

Or a set:

print({*a})

And so on...