从 n 个元素生成所有 4 元组对

Generate all 4-tuple pairs from n elements

我想生成一个包含所有可能的 4 元组对 的列表,给定一个大小为 n 的数组。 n 至少是 8,所以总能找到至少 1 对。

作为有助于理解问题的示例,我使用了问题的较小版本,2 元组对 给定大小为 5 的数组。 2 元组对的预期结果将产生 15 个项目(元组是有序的,没有重复):

[(1,2), (3,4)], [(1,2), (3,5)], [(1,2), (4,5)], [(1,3), (2,4)], [(1,3), (2,5)], [(1,3), (4,5)], [(1,4), (2,3)], [(1,4), (2,5)], [(1,4), (3,5)], [(1,5), (2,3)], [(1,5), (2,4)], [(1,5), (3,4)], [(2,3), (4,5)], [(2,4), (3,5)], [(2,5), (3,4)]

我目前的做法是使用 python 中的 itertools 并遍历 itertools.combinations 返回的所有元素,进行 2 次循环并找到 2 对不共享单个元素,然后使用该元素。

为了在 python 代码中表达这一点,我准备了一个小片段:

arr = list(range(30)) # example list 
comb = list(itertools.combinations(range(0, len(arr)), 4))

for c1 in comb:
    for c2 in comb:  # go through all possible pairs
        if len([val for val in c1 if val in c2]) == 0:  # intersection of both sets results in 0, so they don't share an element
            ... # do something and check for duplicates

此方法正在发挥作用,但由于有 2 个循环,效率低下,并且仅适用于给定时间范围内的小 n。这样做可以更有效率吗?


更新:经过一些回答后,我评估了这些建议。对于我的具体情况,最好的是 MSeifert 的(现已删除)答案提供的(扩展)算法,它执行速度最快:

def generate_four_pairs(n):
    valids = range(0, n)
    for x00, x01, x02, x03, x10, x11, x12, x13 in itertools.combinations(valids, 8):
      yield [x00, x01, x02, x03], [x10, x11, x12, x13]
      yield [x00, x01, x02, x10], [x03, x11, x12, x13]
      yield [x00, x01, x02, x11], [x03, x10, x12, x13]
      yield [x00, x01, x02, x12], [x03, x10, x11, x13]
      yield [x00, x01, x02, x13], [x03, x10, x11, x12]
      yield [x00, x01, x03, x10], [x02, x11, x12, x13]
      yield [x00, x01, x03, x11], [x02, x10, x12, x13]
      yield [x00, x01, x03, x12], [x02, x10, x11, x13]
      yield [x00, x01, x03, x13], [x02, x10, x11, x12]
      yield [x00, x01, x10, x11], [x02, x03, x12, x13]
      yield [x00, x01, x10, x12], [x02, x03, x11, x13]
      yield [x00, x01, x10, x13], [x02, x03, x11, x12]
      yield [x00, x01, x11, x12], [x02, x03, x10, x13]
      yield [x00, x01, x11, x13], [x02, x03, x10, x12]
      yield [x00, x01, x12, x13], [x02, x03, x10, x11]
      yield [x00, x02, x03, x10], [x01, x11, x12, x13]
      yield [x00, x02, x03, x11], [x01, x10, x12, x13]
      yield [x00, x02, x03, x12], [x01, x10, x11, x13]
      yield [x00, x02, x03, x13], [x01, x10, x11, x12]
      yield [x00, x02, x10, x11], [x01, x03, x12, x13]
      yield [x00, x02, x10, x12], [x01, x03, x11, x13]
      yield [x00, x02, x10, x13], [x01, x03, x11, x12]
      yield [x00, x02, x11, x12], [x01, x03, x10, x13]
      yield [x00, x02, x11, x13], [x01, x03, x10, x12]
      yield [x00, x02, x12, x13], [x01, x03, x10, x11]
      yield [x00, x03, x10, x11], [x01, x02, x12, x13]
      yield [x00, x03, x10, x12], [x01, x02, x11, x13]
      yield [x00, x03, x10, x13], [x01, x02, x11, x12]
      yield [x00, x03, x11, x12], [x01, x02, x10, x13]
      yield [x00, x03, x11, x13], [x01, x02, x10, x12]
      yield [x00, x03, x12, x13], [x01, x02, x10, x11]
      yield [x00, x10, x11, x12], [x01, x02, x03, x13]
      yield [x00, x10, x11, x13], [x01, x02, x03, x12]
      yield [x00, x10, x12, x13], [x01, x02, x03, x11]
      yield [x00, x11, x12, x13], [x01, x02, x03, x10]
      yield [x01, x02, x03, x00], [x10, x11, x12, x13]
      yield [x01, x02, x03, x10], [x00, x11, x12, x13]
      yield [x01, x02, x03, x11], [x00, x10, x12, x13]
      yield [x01, x02, x03, x12], [x00, x10, x11, x13]
      yield [x01, x02, x03, x13], [x00, x10, x11, x12]
      yield [x01, x02, x10, x00], [x03, x11, x12, x13]
      yield [x01, x02, x10, x11], [x00, x03, x12, x13]
      yield [x01, x02, x10, x12], [x00, x03, x11, x13]
      yield [x01, x02, x10, x13], [x00, x03, x11, x12]
      yield [x01, x02, x11, x00], [x03, x10, x12, x13]
      yield [x01, x02, x11, x12], [x00, x03, x10, x13]
      yield [x01, x02, x11, x13], [x00, x03, x10, x12]
      yield [x01, x02, x12, x00], [x03, x10, x11, x13]
      yield [x01, x02, x12, x13], [x00, x03, x10, x11]
      yield [x01, x02, x13, x00], [x03, x10, x11, x12]
      yield [x01, x03, x10, x00], [x02, x11, x12, x13]
      yield [x01, x03, x10, x11], [x00, x02, x12, x13]
      yield [x01, x03, x10, x12], [x00, x02, x11, x13]
      yield [x01, x03, x10, x13], [x00, x02, x11, x12]
      yield [x01, x03, x11, x00], [x02, x10, x12, x13]
      yield [x01, x03, x11, x12], [x00, x02, x10, x13]
      yield [x01, x03, x11, x13], [x00, x02, x10, x12]
      yield [x01, x03, x12, x00], [x02, x10, x11, x13]
      yield [x01, x03, x12, x13], [x00, x02, x10, x11]
      yield [x01, x03, x13, x00], [x02, x10, x11, x12]
      yield [x01, x10, x11, x00], [x02, x03, x12, x13]
      yield [x01, x10, x11, x12], [x00, x02, x03, x13]
      yield [x01, x10, x11, x13], [x00, x02, x03, x12]
      yield [x01, x10, x12, x00], [x02, x03, x11, x13]
      yield [x01, x10, x12, x13], [x00, x02, x03, x11]
      yield [x01, x10, x13, x00], [x02, x03, x11, x12]
      yield [x01, x11, x12, x00], [x02, x03, x10, x13]
      yield [x01, x11, x12, x13], [x00, x02, x03, x10]
      yield [x01, x11, x13, x00], [x02, x03, x10, x12]
      yield [x01, x12, x13, x00], [x02, x03, x10, x11]

对于一般方法,我建议使用 NPE 提供的答案,因为这是针对此问题的最短且最易读的答案。

生成所有组合对,然后丢弃几乎所有组合,因为它们包含共同元素,您正在做很多不必要的工作。

以下解决此问题的方法是首先获取四个数字的所有子集(在您的二元组示例中),然后将每个子集拆分为所有可能的对:

import itertools

def gen_pairs(n, m):
  for both_halves in itertools.combinations(xrange(1, n + 1), 2 * m):
    for first_half in itertools.combinations(both_halves, m):
      second_half = tuple(sorted(set(both_halves) - set(first_half)))
      yield [first_half, second_half]

print sorted(gen_pairs(5, 2))

请注意,这不会消除重复项(例如,[(4, 5) (2, 3)][(2, 3), (4, 5)]),因此生成的元素数量是您预期的两倍。

然而,删除重复项是微不足道的。这留作 reader.

的练习

您可以使用可能更快的排列和拆分:

array = ...
size = 4
c = itertools.permutations(array)
for t in c:
    a = []
    for i in range(0, len(t), size):
        if i + size <= len(t):
            a.append(t[i:i+size])
    yield a

注意:在数组长度不是大小的倍数的情况下,此解决方案有效但会产生重复项。

我愿意:

from itertools import combinations

sample = range(1,6)
x1 = [subset for subset in combinations(sample,2)] #getting the set of tuples
x2 = [list(subset) for subset in combinations(x1,2)] #getting the pair of tuples
x3 = [x for x in x2 if (set(x[0]) & set(x[1]) == set())] #finally filtering the tuples with no intersection

输出:

[[(1, 2), (3, 4)],
 [(1, 2), (3, 5)],
 [(1, 2), (4, 5)],
 [(1, 3), (2, 4)],
 [(1, 3), (2, 5)],
 [(1, 3), (4, 5)],
 [(1, 4), (2, 3)],
 [(1, 4), (2, 5)],
 [(1, 4), (3, 5)],
 [(1, 5), (2, 3)],
 [(1, 5), (2, 4)],
 [(1, 5), (3, 4)],
 [(2, 3), (4, 5)],
 [(2, 4), (3, 5)],
 [(2, 5), (3, 4)]]

这里是生成 MSeifert 的 yield 语句的代码:)(它只生成其中的 35 个,这意味着没有重复的:)

def g(L, n, k, A, B):
    if len(A) == k:
        return [[tuple(A), tuple([L[i] for i in xrange(0, n + 1)] + B)]]

    elif len(B) == k:
        return [[tuple([L[i] for i in xrange(0, n + 1)] + A), tuple(B)]]

    return g(L, n - 1, k, A, [L[n]] + B[0:]) + g(L, n - 1, k, [L[n]] + A[0:], B)

def f(L):
    assert(len(L) > 3 and len(L) % 2 == 0)
    return g(L, len(L) - 2, len(L) / 2, [], [L[-1]])

for i in f(['x00','x01','x02','x03','x10','x11','x12','x13']):
    print(i)