Creating combinations based on possible pairs

I have n indices, which give n(n-1)/2 pairwise combinations, e.g. for n=3:

(i,j,k) -> (i,j), (i,k), (j,k)
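
For a general n, the index pairs can be enumerated with `itertools.combinations`. This is only a small sketch; the names `n` and `index_pairs` are illustrative and the value of `n` is arbitrary:

```python
from itertools import combinations

n = 4  # number of indices (illustrative value)

# All unordered pairs of index positions, in lexicographic order
index_pairs = list(combinations(range(n), 2))

print(index_pairs)       # [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(len(index_pairs))  # 6, i.e. n * (n - 1) // 2
```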

Now, for each of these pairs I know the allowed values, e.g.

(i,j) = (1,2), (1,3), (2,2)
(i,k) = (2,2), (1,2), (2,4)
(j,k) = (1,2), (4,3), (2,2)

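In other words, in any combination (i,j,k) the pair (i,j) must be one of (1,2), (1,3), (2,2), and likewise for the other pairs. I want to construct all possible combinations; in the example above there are only two: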

(i,j,k) = (2,2,2)
(i,j,k) = (1,2,2)

I have currently implemented this as follows:

import numpy as np

ij = np.array(([1,2], [1,3], [2,2]))
ik = np.array(([2,2], [1,2], [2,4]))
jk = np.array(([1,2], [4,3], [2,2]))

possibilities = []

possible_i = np.union1d(ij[:,0], ik[:,0])
possible_j = np.union1d(ij[:,1], jk[:,0])
possible_k = np.union1d(ik[:,1], jk[:,1])

for i in possible_i:
    for j in possible_j:

        if ([i,j] == ij).all(1).any():     
            for k in possible_k:
                if (([i,k] == ik).all(1).any() and
                    ([j,k] == jk).all(1).any()):
                    print(i,j,k)

While this works and can easily be adapted to any n, it doesn't seem very efficient to me. For example, it checks the combinations:

1 2 2
1 2 2
1 2 3
1 2 4
1 3 2
1 3 3
1 3 4
2 2 2
2 2 2
2 2 3
2 2 4

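Of course, once we have found that (i,j,k) = (i,2,3) is invalid, there is no need to check other combinations of this form. Is there a more efficient way to solve this task (one that also works for higher n)?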

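We represent the possible combinations as lists of length n. Indices for which we have no information yet contain None.

Each round processes one pair of indices and checks all the rules for that pair.

If the first value of the pair is present in a possible combination from the previous round and the second index has not been touched yet (None), we add it as a new possible combination for this round.

If both values are present in a previous combination, this confirms that it may be valid, and we add it as well.

We can then discard the previous round's results: any combination that we previously considered possible but that was not confirmed in this round is impossible.

So, the code: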

from itertools import combinations

def possible_combs(n, pairs_list):
    # Pairs of indices, generated in the same order as the lists of allowed pairs
    indices = combinations(range(n), r=2)
    # Current list of possible combinations. None means no information for this index
    current = [[None] * n]

    for (first, last), allowed in zip(indices, pairs_list):
        previous = current
        current = []
        # Iteration on each allowed pair for the current pair of indices
        for i, j in allowed:
            for comb in previous:
                if comb[first] is None:
                    # We can have previous combinations having None for the starting index 
                    # only during the first step. In this case, we create the path. 
                    new = comb[:]
                    new[first] = i
                    new[last] = j
                    current.append(new)
                elif comb[first] == i:
                    if comb[last] is None:
                        # A path leading to a yet unknown value, we add it
                        new = comb[:]
                        new[last] = j
                        current.append(new)
                    elif comb[last] == j:
                        # A valid path, we keep it
                        current.append(comb[:])
                    # At this point, any previous combination that didn't satisfy 
                    # any rule of this turn hasn't made it
                    # to current and will be forgotten...
    return current

Run on your sample data:

possible_combs(3, [[(1,2), (1,3), (2,2)],
                    [(2,2), (1,2), (2,4)],
                    [(1,2), (4,3), (2,2)]])

Output:

[[2, 2, 2], [1, 2, 2]]

Note that it makes no assumption about the number of rules for each pair of indices.

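The problem can be described as a graph whose nodes are organised in columns, one column per index. A node is uniquely identified by the column it is in and the value it has.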

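The solutions can be found by taking the possible edges between the first two columns, then extending those edges, where possible, to paths of size 2 that involve a node of the next column, then extending again to size 3 with the column after that, and so on. Each time a node is added to a path, it must be verified that all previous nodes in the path are connected to that new node.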

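To do this efficiently, I suggest using an adjacency-list type of data structure, or actually adjacency sets, so that you can quickly get which nodes in a given column can be reached from a given node in another column. These neighbour sets can be intersected, leaving only the connections that satisfy all constraints.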
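
As a rough sketch of that idea, the snippet below builds adjacency sets for two of the example pair lists and intersects them for a single partial path. The helper name `adjacency` and the variable names are illustrative only:

```python
from collections import defaultdict

# Allowed (left value, right value) pairs, taken from the example
ik = [(2, 2), (1, 2), (2, 4)]
jk = [(1, 2), (4, 3), (2, 2)]

def adjacency(pairs):
    """Map each left-hand value to the set of right-hand values it may pair with."""
    adj = defaultdict(set)
    for a, b in pairs:
        adj[a].add(b)
    return adj

adj_ik = adjacency(ik)
adj_jk = adjacency(jk)

# For the partial path (i, j) = (2, 2), k must be a neighbour of both i and j.
# The & operator builds a new set, so the adjacency sets themselves stay intact.
candidates_k = adj_ik[2] & adj_jk[2]
print(candidates_k)  # {2}
```

Here i = 2 allows k in {2, 4} and j = 2 allows only k = 2, so the intersection leaves the single extension (2, 2, 2).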

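I would define the input constraints as a dictionary, so that there is no doubt about which indices (columns) a given list of pairs refers to. So the example input would be this dictionary: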

{
    (0, 1): [(1,2), (1,3), (2,2)],
    (0, 2): [(2,2), (1,2), (2,4)],
    (1, 2): [(1,2), (4,3), (2,2)]
}

Code:

from collections import defaultdict

def solve(constraints):
    # n is the size of each output tuple 
    n = max(b for _, b in constraints) + 1

    # convert constraints to adjacency sets
    graph = {}
    for key, pairs in constraints.items():
        dct = defaultdict(set)
        for a, b in pairs:
            dct[a].add(b)
        graph[key] = dct

    paths = constraints[(0, 1)]
    for j in range(2, n):
        newpaths = []
        for path in paths:
            additions = set(graph[(0, j)][path[0]])  # copy, so that &= below does not mutate the graph
            for i in range(1, len(path)):
                additions &= graph[(i, j)][path[i]]
                if not additions:  # quick exit
                    break
            newpaths.extend((*path, num) for num in additions)
        paths = newpaths

    return paths

Call it like this:

constraints = {
    (0, 1): [(1,2), (1,3), (2,2)],
    (0, 2): [(2,2), (1,2), (2,4)],
    (1, 2): [(1,2), (4,3), (2,2)]
}
result = solve(constraints)
print(result)

Output:

[(1, 2, 2), (2, 2, 2)]

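You can use recursion, keeping track of the values already chosen and checking each new value against the pair tuples in the input. This way the combinations are built up directly, without having to filter the whole set of combinations afterwards: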

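from collections import defaultdict

d = {
    (0, 1): [(1, 2), (1, 3), (2, 2)],
    (0, 2): [(2, 2), (1, 2), (2, 4)],
    (1, 2): [(1, 2), (4, 3), (2, 2)],
}

# d1 maps each index to every value it takes in any constraint
d1 = defaultdict(list)
for (a, b), pairs in d.items():
    firsts, seconds = zip(*pairs)
    d1[a].extend(firsts)
    d1[b].extend(seconds)

def combos(p, c=None):
    # p: indices still to assign; c: values chosen so far, keyed by index
    c = c or {}
    if not p:
        yield [*c.values()]
    else:
        for i in set(d1[p[0]]):
            # keep i only if it forms an allowed pair with every earlier choice
            if all((b, i) in d[(a, p[0])] for a, b in c.items()):
                yield from combos(p[1:], {**c, p[0]: i})

# sorted() ensures indices are visited in increasing order, as the d[(a, p[0])] lookup requires
print(list(combos(sorted({i for k in d for i in k}))))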

Output:

[[1, 2, 2], [2, 2, 2]]
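
For small inputs, results like the ones above can be cross-checked against a plain brute-force search. The following is only a reference sketch, not an efficient solution: `brute_force` is a hypothetical helper name, and it assumes the constraints dictionary has a key (a, b) for every pair of indices with a < b.

```python
from itertools import combinations, product

def brute_force(constraints):
    """Reference solver: try every candidate tuple, keep those satisfying all pairs."""
    n = max(b for _, b in constraints) + 1
    allowed = {key: set(pairs) for key, pairs in constraints.items()}

    # Candidate values per index: every value that index takes in any constraint
    values = [set() for _ in range(n)]
    for (a, b), pairs in constraints.items():
        for x, y in pairs:
            values[a].add(x)
            values[b].add(y)

    return sorted(
        combo
        for combo in product(*values)
        if all((combo[a], combo[b]) in allowed[(a, b)]
               for a, b in combinations(range(n), 2))
    )

constraints = {
    (0, 1): [(1, 2), (1, 3), (2, 2)],
    (0, 2): [(2, 2), (1, 2), (2, 4)],
    (1, 2): [(1, 2), (4, 3), (2, 2)],
}
print(brute_force(constraints))  # [(1, 2, 2), (2, 2, 2)]
```

Because it enumerates the full Cartesian product of candidate values, this is only practical for small n, but it gives an independent answer to compare against.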