How to get the Cartesian product of two iterables when one of them is infinite

Suppose I have two iterables, one finite and one infinite:

import itertools

teams = ['A', 'B', 'C']
steps = itertools.count(0, 100)

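I would like to know whether I can avoid the nested for loop and instead use one of the infinite iterators from the itertools module, such as cycle or repeat, to get an iterator over the Cartesian product of these two.

The loop should be infinite, because the stop value for steps is not known in advance.

Expected output: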

$ python3 test.py  
A 0
B 0
C 0
A 100
B 100
C 100
A 200
B 200
C 200
etc...

Working code with nested loops:

from itertools import count, cycle, repeat

STEP = 100 
LIMIT = 500
TEAMS = ['A', 'B', 'C']


def test01():
    for step in count(0, STEP):
        for team in TEAMS:
            print(team, step)
        if step >= LIMIT:  # Limit for testing
            break

test01()

Try itertools.product:

from itertools import product
for i, j in product(range(0, 501, 100), 'ABC'):
    print(j, i)

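As the documentation says, product(A, B) is equivalent to ((x,y) for x in A for y in B). product returns an iterator that yields tuples one at a time, so it never builds the full list of results in memory. It does, however, read each input completely before yielding anything, so every input must be finite.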

This function is roughly equivalent to the following code, except that the actual implementation does not build up intermediate results in memory:

def product(*args, **kwds):
    # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
    # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
    # Python 3: map() returns an iterator, so build a list before repeating it
    pools = [tuple(pool) for pool in args] * kwds.get('repeat', 1)
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    for prod in result:
        yield tuple(prod)
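
To make the finite-input requirement concrete, here is a minimal sketch that wraps one input in a hypothetical helper, noisy, which records each item as it is pulled. It assumes CPython, where product converts every input to a tuple when it is constructed, before the first result is requested.

```python
from itertools import product

consumed = []  # records every item pulled from the wrapped iterable


def noisy(values):
    """Hypothetical helper: yield items from values, logging each one."""
    for value in values:
        consumed.append(value)
        yield value


pairs = product(noisy(range(3)), 'AB')
# Snapshot of what has been read before any result is requested.
consumed_before_first_result = list(consumed)

first = next(pairs)
print(consumed_before_first_result, first)
```

If the snapshot already holds the whole input, an endless input such as count() can never get past that up-front conversion.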

However, because of a known issue, you cannot use itertools.product with an infinite iterable:

According to the documentation, itertools.product is equivalent to nested for-loops in a generator expression. But, itertools.product(itertools.count(2010)) is not.

>>> import itertools
>>> (year for year in itertools.count(2010))
<generator object <genexpr> at 0x026367D8>
>>> itertools.product(itertools.count(2010))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError

The input to itertools.product must be a finite sequence of finite iterables.
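
Since product needs finite inputs, one option is to write the generator expression directly, with the infinite iterable driving the outer loop. The following is a minimal sketch; team_steps is a hypothetical helper name, and islice is used only to cap the demonstration.

```python
from itertools import count, islice

teams = ['A', 'B', 'C']


def team_steps(teams, steps):
    """Hypothetical helper: lazily pair every team with each step.

    steps may be infinite; teams must be finite and re-iterable (e.g. a list).
    """
    # The outer loop runs over steps, so each step is paired with every team
    # before the next step is pulled from the (possibly endless) iterator.
    return ((team, step) for step in steps for team in teams)


# islice caps the otherwise endless stream for demonstration purposes.
for team, step in islice(team_steps(teams, count(0, 100)), 6):
    print(team, step)
```

This is still a nested loop in spirit, but it is a single lazy expression that can be passed around like any other iterator.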

For an infinite loop, you can use this code.

There is a way to do this without nested loops. You can write:

from itertools import count, chain, cycle, tee 

teams = ['A', 'B', 'C']
steps = count(0, 100)

for team, step in zip(cycle(teams), chain.from_iterable(zip(*tee(steps, len(teams))))):
    if step == 300:
        break
    print(team, step)

This gives the expected output:

A 0
B 0
C 0
A 100
B 100
C 100
A 200
B 200
C 200

It does the job, but it is much less readable.
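
A possibly more readable variant, offered as a sketch rather than a definitive answer, pairs the teams with repeat(step) for each step and flattens the result with chain.from_iterable. The helper name team_steps_flat is hypothetical, and teams is assumed to be a finite, re-iterable sequence such as a list.

```python
from itertools import chain, count, islice, repeat

teams = ['A', 'B', 'C']


def team_steps_flat(teams, steps):
    """Hypothetical helper: Cartesian product with steps as the slow axis."""
    # repeat(step) is endless, so zip stops as soon as teams is exhausted,
    # giving exactly one (team, step) pair per team for the current step.
    return chain.from_iterable(zip(teams, repeat(step)) for step in steps)


# Take only the first nine pairs of the infinite stream.
for team, step in islice(team_steps_flat(teams, count(0, 100)), 9):
    print(team, step)
```

Unlike the tee-based version, this does not keep several copies of the steps iterator in sync, and it uses repeat, one of the iterators mentioned in the question.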