Modifying size of iterable during for loop - how is looping determined?

Question

For loops are mentioned in two places in the Python docs (that I have found). I did try to find the source code for for loops in CPython, but to no avail.

Here's what I'm trying to understand: I had assumed for loops were a sort of while i <= len(iterable) then loop, or if i <= len(iterable) then loop. I'm not sure that's the case, and here's why:

y = [1, 2, 3, 4]
for x in y:
  print(y)
  print(y.pop(0))

Output:
[1, 2, 3, 4]
1
[2, 3, 4]
2

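I know you shouldn't modify an iterable while you're looping over it. I know that. But still, this isn't a random result - it happens every time this code is run: 2 loops. You also get 2 loops if you run pop() instead.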

Perhaps even weirder, you seem to reliably get (len(y)+1)//2 loops (at least with .pop(); I haven't tried much other testing):

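If y = [1, 2] there is one loop
If y = [1, 2, 3] there are two loops
If y = [1, 2, 3, 4] there are still two loops
If y = [1, 2, 3, 4, 5] there are three loops
If y = [1, 2, 3, 4, 5, 6] there are still three loops
If y = [1, 2, 3, 4, 5, 6, 7] there are four loops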
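The counts listed above can be reproduced with a short script. This is only a minimal sketch: count_loops is a hypothetical helper name, and the formula is read here as (len(y) + 1) // 2, i.e. half the length rounded up.

```python
def count_loops(n):
    """Count for-loop passes over [1..n] when every pass pops the first item."""
    y = list(range(1, n + 1))
    loops = 0
    for _ in y:
        y.pop(0)      # shrink the list from the front on every pass
        loops += 1
    return loops


for n in range(2, 8):
    # columns: list length, observed loops, (n + 1) // 2
    print(n, count_loops(n), (n + 1) // 2)
```

For the lengths tried here, the second and third columns agree.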

According to the Python docs:

Note

There is a subtlety when the sequence is being modified by the loop (this can only occur for mutable sequences, e.g. lists). An internal counter is used to keep track of which item is used next, and this is incremented on each iteration. When this counter has reached the length of the sequence the loop terminates. This means that if the suite deletes the current (or a previous) item from the sequence, the next item will be skipped (since it gets the index of the current item which has already been treated). Likewise, if the suite inserts an item in the sequence before the current item, the current item will be treated again the next time through the loop. This can lead to nasty bugs that can be avoided by making a temporary copy using a slice of the whole sequence, e.g.,

for x in a[:]:
    if x < 0: a.remove(x)
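To see the skipping that the note describes, here is a small sketch with made-up sample data (the lists a and b are illustrative only). It runs the same removal loop once over the list itself and once over a slice copy.

```python
# Removing while iterating over the same list: the item after a removed one is skipped.
a = [1, -2, -3, 4]
for x in a:
    if x < 0:
        a.remove(x)
print(a)  # [1, -3, 4]  (-3 was never visited)

# Iterating over a slice copy: every item is visited.
b = [1, -2, -3, 4]
for x in b[:]:
    if x < 0:
        b.remove(x)
print(b)  # [1, 4]
```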

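Can anyone explain the logic Python uses when looping over an iterable that is modified during the loop? How do iter and StopIteration, and __getitem__(i) and IndexError factor in? What about iterators that aren't lists? And most importantly, is this / where is this in the docs?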

As @Yang K suggested:

y = [1, 2, 3, 4, 5, 6, 7]
for x in y:
  print("y: {}, y.pop(0): {}".format(y, y.pop(0)))
  print("x: {}".format(x))

# Output
y: [2, 3, 4, 5, 6, 7], y.pop(0): 1
x: 1
y: [3, 4, 5, 6, 7], y.pop(0): 2
x: 3
y: [4, 5, 6, 7], y.pop(0): 3
x: 5
y: [5, 6, 7], y.pop(0): 4
x: 7

Answer 1

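The loop executes until the iterable says it has no more elements. After two cycles, the iterator has gone through two elements and the list has lost two elements, which means the iterator is at the end of the list, and the loop terminates.

Your code is equivalent to this: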

y = [1, 2, 3, 4]
i = iter(y)
while True:
    try:
        x = next(i)
    except StopIteration:
        break
    print(y)
    print(y.pop(0))

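The list iterator holds the index that is to be read next. In the third cycle, the list is [3, 4], and next(i) would need to read y[2], which is not possible, so next raises StopIteration, which ends the loop.

EDIT As to your other questions: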
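The index bookkeeping described above can be sketched with an explicit counter. This is an illustration only, not CPython's actual code: index stands in for the iterator's internal position, and trace is a hypothetical name used to record each pass.

```python
y = [1, 2, 3, 4]
index = 0                        # stand-in for the list iterator's internal counter
trace = []                       # (x, snapshot of y) for each pass

while index < len(y):            # the length is re-checked before every pass
    x = y[index]                 # read the element at the saved position
    index += 1
    trace.append((x, list(y)))   # snapshot taken before the pop
    y.pop(0)

print(trace)  # [(1, [1, 2, 3, 4]), (3, [2, 3, 4])]
print(index, y)  # 2 [3, 4]
```

After two passes the counter is 2 and the list has length 2, so the check fails and the loop stops. It also shows why x takes the values 1 and 3 rather than 1 and 2.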

How do iter and StopIteration, and __getitem__(i) and IndexError factor in?

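The first two are as described above: they are what defines a for loop. Or, if you will, it is the contract of iter: the iterator will yield items until it stops by raising StopIteration.

The latter two, I think, do not participate at all, since the list iterator is implemented in C. For example, the check for whether the iterator is exhausted directly compares the current index with PyList_GET_SIZE, which looks directly at the ->ob_size field; it does not pass through Python any more. Obviously, you could make a list iterator written entirely in pure Python, and you would likely either use len to perform the check, or catch IndexError and again let the underlying C code perform the check against ->ob_size.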
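A pure-Python list iterator of the kind mentioned above might look roughly like this. It is a sketch, not how CPython does it: PyListIterator is a hypothetical class name, and it takes the "catch IndexError" route. It also ignores details such as what happens after exhaustion.

```python
class PyListIterator:
    """Hypothetical pure-Python stand-in for the C list iterator."""

    def __init__(self, seq):
        self._seq = seq
        self._index = 0          # position of the next item to read

    def __iter__(self):
        return self

    def __next__(self):
        try:
            # The bounds check happens inside list indexing itself.
            item = self._seq[self._index]
        except IndexError:
            raise StopIteration from None
        self._index += 1
        return item


y = [1, 2, 3, 4]
seen = []
for x in PyListIterator(y):
    seen.append(x)
    y.pop(0)
print(seen)  # [1, 3]
```

Used on the list from the question, it yields the same two passes as the built-in iterator.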

What about iterators that aren't lists?

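You can define any object to be iterable. When you call iter(obj), it is essentially the same as calling obj.__iter__(). This is expected to return an iterator, which knows what to do with i.__next__() (which is what next(i) translates to). I believe dicts iterate by indexing into the list of their keys (I think; I haven't checked). You can make an iterator that will do anything you want, if you code it. For example: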

class AlwaysEmpty:
    def __iter__(self):
        return self
    def __next__(self):
        raise StopIteration

for x in AlwaysEmpty():
    print("there was something")

Unsurprisingly, nothing is printed.
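Other built-in iterators can follow different rules from the list iterator. As one example, a sketch of what I would expect on CPython: the dict iterator notices a size change and raises RuntimeError instead of quietly skipping items. The dict d and the variable error are illustrative names.

```python
d = {"a": 1, "b": 2, "c": 3}
error = None
try:
    for key in d:
        d.pop(key)               # changes the dict's size in the middle of the loop
except RuntimeError as exc:
    error = str(exc)             # raised when the iterator is asked for the next key

print(error)
print(d)                         # only the first key was removed before the error
```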

And most importantly, is this / where is this in the docs?

Iterator Types
