Why does Python allow out-of-range slice indexes for sequences?

Question

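So I just came across what seems to me like a strange Python feature and wanted some clarification about it.

The following array manipulation somewhat makes sense: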

p = [1,2,3]
p[3:] = [4]
# p is now [1,2,3,4]

I imagine it is actually just appending this value to the end, correct?
Why can I do this, however?

p[20:22] = [5,6]
# p is now [1,2,3,4,5,6]

And even more so this:

p[20:100] = [7,8]
# p is now [1,2,3,4,5,6,7,8]

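This just seems like wrong logic. It seems like this should throw an error!

Any explanation?
-Is it just a weird thing Python does?
-Is there a purpose to it?
-Or am I thinking about this the wrong way?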

Answer 1

The documentation has your answer:

s[i:j]: slice of s from i to j (note (4))

(4) The slice of s from i to j is defined as the sequence of items with index k such that i <= k < j. If i or j is greater than len(s), use len(s). If i is omitted or None, use 0. If j is omitted or None, use len(s). If i is greater than or equal to j, the slice is empty.

The documentation of IndexError confirms this behavior:

exception IndexError

Raised when a sequence subscript is out of range. (Slice indices are silently truncated to fall in the allowed range; if an index is not an integer, TypeError is raised.)

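Essentially, something like p[20:100] is reduced to p[len(p):len(p)]. p[len(p):len(p)] is an empty slice at the end of the list, and assigning a list to it modifies the end of the list to contain that list. Thus, it works like appending to or extending the original list.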
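One way to observe this clamping directly is the built-in slice.indices() method, which reports the start, stop, and step a slice resolves to for a sequence of a given length. The following is only a minimal sketch, reusing a four-item p like the one in the question:

```python
p = [1, 2, 3, 4]

# slice.indices(length) returns the (start, stop, step) actually used
# for a sequence of that length; out-of-range bounds are clamped.
start, stop, step = slice(20, 100).indices(len(p))
print(start, stop, step)   # 4 4 1

# Reading an out-of-range slice yields an empty list rather than an error...
print(p[20:100])           # []

# ...whereas plain indexing with the same number raises IndexError.
try:
    p[20]
    raised = False
except IndexError:
    raised = True
print(raised)              # True

# Assigning to the clamped (empty) slice at the end extends the list.
p[20:100] = [7, 8]
print(p)                   # [1, 2, 3, 4, 7, 8]
```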

This behavior is the same as what happens when you assign a list to an empty slice anywhere in the original list. For example:

In [1]: p = [1, 2, 3, 4]

In [2]: p[2:2] = [42, 42, 42]

In [3]: p
Out[3]: [1, 2, 42, 42, 42, 3, 4]

Answer 2

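Part of the question regarding out-of-range indices

Slice logic automatically clips the indices to the length of the sequence.

Allowing slice indices to extend past the end points was done for convenience. It would be a pain to have to range check every expression and then adjust the limits manually, so Python does it for you.

Consider the use case of wanting to display no more than the first 50 characters of a text message.

The easy way (what Python does now):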

preview = msg[:50]

Or the hard way (do the limit checks yourself):

n = len(msg)
preview = msg[:50] if n > 50 else msg

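Manually implementing that logic for adjusting end points would be easy to forget, easy to get wrong (updating the 50 in two places), wordy, and slow. Python moves that logic to its internals, where it is succinct, automatic, fast, and correct. This is one of the reasons I love Python :-)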
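As a small illustration of the easy way, here is a sketch of a preview helper. The name make_preview and the limit parameter are made up for this example and do not come from the original answer:

```python
def make_preview(msg: str, limit: int = 50) -> str:
    """Return at most `limit` leading characters of `msg`."""
    # No length check is needed: the slice stops at len(msg) when
    # the message is shorter than `limit`.
    return msg[:limit]


short_msg = "hello"
long_msg = "x" * 80

print(make_preview(short_msg))      # hello
print(len(make_preview(long_msg)))  # 50
print(repr(make_preview("")))       # ''
```

Because the limit appears in only one place, changing it cannot leave a stale copy behind in a separate length check.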

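Part of the question regarding assignment length not matching the input length

The OP also wanted to know the rationale for allowing assignments such as p[20:100] = [7,8], where the assignment target has a length (80) that differs from the replacement data length (2).

It's easiest to see the motivation by an analogy with strings. Consider "five little monkeys".replace("little", "humongous"). Note that the target "little" has only six letters while "humongous" has nine. We can do the same with lists: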

>>> s = list("five little monkeys")
>>> i = s.index('l')
>>> n = len('little')
>>> s[i : i+n ] = list("humongous")
>>> ''.join(s)
'five humongous monkeys'
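The same mechanics can be sketched on a plain list of numbers: the replaced span and the replacement need not have the same length, so the list grows or shrinks accordingly. The variable name nums is illustrative only:

```python
nums = [0, 1, 2, 3, 4, 5]

# Replace a 2-item span with 4 items: the list grows by 2.
nums[1:3] = [10, 11, 12, 13]
print(nums)   # [0, 10, 11, 12, 13, 3, 4, 5]

# Replace a 4-item span with 1 item: the list shrinks by 3.
nums[1:5] = [99]
print(nums)   # [0, 99, 3, 4, 5]

# Replace a span with nothing: equivalent to deleting it.
nums[1:2] = []
print(nums)   # [0, 3, 4, 5]
```

This applies to slices with the default step. With an explicit step other than 1, Python requires the replacement to have exactly as many items as the slice selects and raises ValueError otherwise.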

It's all about convenience.

Prior to the introduction of the copy() and clear() methods, these used to be popular idioms:

s[:] = []           # clear a list
t = u[:]            # copy a list

Even now, we use this to update lists in place when filtering:

import math
s[:] = [x for x in s if not math.isnan(x)]   # filter-out NaN values
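What distinguishes s[:] = ... from plain s = ... is that the slice assignment mutates the existing list object, so every other reference to it sees the change. A minimal sketch with made-up names (readings, alias):

```python
import math

readings = [1.0, float("nan"), 2.5, float("nan")]
alias = readings                      # second reference to the same list

# Slice assignment replaces the contents in place.
readings[:] = [x for x in readings if not math.isnan(x)]
print(readings)            # [1.0, 2.5]
print(alias is readings)   # True
print(alias)               # [1.0, 2.5]

# Plain assignment would instead rebind the name to a new list,
# leaving `alias` pointing at the old object.
readings = [x for x in readings if x > 2]
print(readings)            # [2.5]
print(alias is readings)   # False
print(alias)               # [1.0, 2.5]
```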

Hopefully, these practical examples give a good perspective on why slicing works as it does.
