为特定的精确覆盖问题定义约束

Define contraints for specific exact cover problem

我有一个具体的问题,我想使用 Knuth's Algorithm X. However, I struggle to translate my problem into suitable contraints, that make up the incidence matrix 来解决算法 X 的运算。

对于我的体育俱乐部夏季锦标赛,我想制定一个将四名球员分组在一起的赛程表,而无需在随后的比赛轮次中重新分组任何一对球员。

我认为这可以很好地翻译成这样的 exact cover problem

为 20 个玩家设置后,我得到了一个包含 20 列和 4845 行(每个 player/column 969 行)的关联矩阵。

算法 X 会很好地找到解决方案,但这只会涵盖一个(第一)轮。让算法继续下去,会为same轮吐出更多备选方案,这不是我感兴趣的。 所以我围绕算法构建了一个迭代器,它将采用解决方案并根据玩家重叠从关联矩阵中删除行:每当解决方案中的一组与矩阵的任何行至少有 2 的交集时,该行将被删除.在算法的第一个 运行 之后,矩阵被剔除到 1280 行。 运行 算法 X 将找到下一个解决方案,依此类推,直到找不到为止。

长话短说,这种方法不是精确覆盖问题的正确应用 - 我必须反复寻找部分解决方案。正确的精确掩护问题应该包括以某种方式进行的回合顺序。为什么?因为现在我没有探索所有可能的解决方案! 20 名玩家就是最好的例子。算法 X 只会找到连续 3 轮的解决方案。 Yet, I do know that there are at least 5, when different intermediate solutions are chosen.这正是我希望算法 X 可以为我解决的工作。使用上述方法,玩轮之间没有回溯

尽管这个问题足够抽象以至于不需要代码,但这是我在 Python 中对 Knuth 的 DLX(算法 X 和 Dancing Links)的实现:

from itertools import combinations
def dancing_links (players):
    """
    Implement the dancing links algorithm as described by Donald Knuth, which
    attemts to solve an exact cover problem exhaustively in an efficient way.
    Adapted for my use case, I define the incidence matrix as follows:
        * Columns are players.
        * Rows are groups of players.
        * The intersection of groups and players mean that that player is a
          member of that group.
        * One group contains exactly four players, i.e. each row has
          exactly four 1s.
        * Repeatedly solve the exact cover problem for a reduced set of groups,
          where each round the total set of groups is filtered for illegal
          groups. An illegal group features at least two players that
          have already played together in a round.
    """

    class FoundSolution (Exception):
        "Use the exception to abort recursive stacks"
        pass

    # Dancing links is based on "doubly linked lists" intersecting
    # with each other. Python doesn't have this kind of data structure
    # and implementing it is quite expensive. E.g. each field of the incidence
    # matrix could be a Node which has pointers in all four directions,
    # The Node class with 6 attributes (four pointers, a name and arbitrary
    # data) needs to undergo countless name lookups, which is a slow process
    # in Python. So instead, I represent each node without a class definition
    # as a dict.
    # 
    # Since we're walking over so many doubly linked lists, starting from
    # any of its nodes, we need to remember where we started and iterate
    # through all of them. That clutters our code later on a lot without
    # this iterator function.
    def iter_dll (start, direction='right'):
        next = start[direction]
        # Need to explicitly compare object ids. Otherwise Python
        # would try to do a deep comparison of two dicts. which is impossible
        # due to the circular referencing.
        while id(start) != id(next):
            yield next
            next = next[direction]

    def cover (column):
        """
        Cover a column by removing its head node from the control row and
        removing each of its rows from other columns that intersect.
        """
        column['left']['right'] = column['right']
        column['right']['left'] = column['left']
        for r in iter_dll(column, 'down'):
            for c in iter_dll(r):
                c['up']['down'] = c['down']
                c['down']['up'] = c['up']

    def uncover (column):
        # Undo the changes caused by a call to cover(dll) by injecting the
        # linked nodes with the remembered predecessor and successor.
        for r in iter_dll(column, 'up'):
            for c in iter_dll(r, 'left'):
                c['up']['down'] = c['down']['up'] = c
        else:
            column['left']['right'] = column['right']['left'] = column

    def search (i, root):
        if id(root['right']) == id(root):
            # The only way to exit the complete recursion stack is an exception.
            raise FoundSolution
        for c in iter_dll(root):
            cover(c)
            for r in iter_dll(c, 'down'):
                lineup.append(r)
                for j in iter_dll(r):
                    cover(j['data'])
                search(i+1, root)
                lineup.pop()
                for j in iter_dll(r, 'left'):
                    uncover(j['data'])
            else:
                uncover(c)

    def generate_incidence_matrix (groups):
        # The gateway to our web of doubly linked lists.
        root = {'name': 'root', 'data': None}
        # Close the circle in left and right dirctions, so we can keep the
        # circle closed while injecting new nodes.
        root['right'] = root['left'] = root
        # The control row of column headers is created by attaching each new
        # Header with the previous one.
        for player in players:
            n = {
                'name': 'Headnode {}'.format(player),
                'data': player,
                'right': root,
                'left': root['left'],
            }
            n['up'] = n['down'] = n
            root['left']['right'] = root['left'] = n

        # Now append nodes to each column header node in our control row -
        # one for each player of a distinct group of four players.
        rnmbr = 0
        # Seed for new row nodes
        seed = {'name': 'seed', 'data': None}
        for g in groups:
            rnmbr += 1
            seed['right'] = seed['left'] = seed
            # Iterate through the control nodes for each of the four players.
            for header in (m for m in iter_dll(root) for p in g if m['data'] == p):
                n = {
                    # Chose a name that identifies the row and colum for this
                    # new node properly.
                    'name': 'R-{},C-{}'.format(rnmbr, header['data']),
                    'data': header,
                    'up': header['up'],
                    'down': header,
                    'left': seed,
                    'right': seed['right']
                }
                header['up']['down'] = header['up'] = n
                seed['right']['left'] = seed['right'] = n
            else:
                # Extract the seed from this row
                seed['right']['left'] = seed['left']
                seed['left']['right'] = seed['right']

        return root

    groups = tuple(combinations(players, 4))    
    groups_per_round = len(players)/4
    lineups = []

    while len(groups) >= groups_per_round:
        root = generate_incidence_matrix(groups)
        lineup = []
        try:
            search(0, root)
        except FoundSolution:
            lineup = reduce(list.__add__, ([r['data']['data']] + [n['data']['data'] for n in iter_dll(r)] for r in lineup))
            lineup = tuple(tuple(sorted(lineup[i:i + 4])) for i in xrange(0, len(lineup), 4))
            lineups.append(lineup)
            groups = tuple(group for group in groups if all(len(g.intersection(set(group))) < 2 for g in (set(s) for s in lineup))) 
        else:
            break

    return lineups

给定玩家列表,此函数将在屏幕上打印中间解决方案,直到选项用尽。遗憾的是,它没有我希望的那么快,但对我来说这是一个很好的编程练习。 :-)

调用上面定义的 dancing_links() 函数将产生以下输出...

>>> pprint.pprint(dancing_links(range(1,21)))
[((1, 2, 3, 4), (5, 6, 7, 8), (9, 10, 11, 12), (13, 14, 15, 16), (17, 18, 19, 20)),
 ((1, 5, 9, 13), (2, 6, 10, 17), (3, 7, 14, 18), (4, 11, 15, 19), (8, 12, 16, 20)),
 ((1, 6, 11, 14), (2, 5, 12, 18), (3, 8, 13, 19), (4, 9, 16, 17), (7, 10, 15, 20))]

我的预期更像是...

[((1, 2, 3, 4), (5, 6, 7, 8), (9, 10, 11, 12), (13, 14, 15, 16), (17, 18, 19, 20)),
 ((1, 5, 9, 13), (2, 6, 10, 17), (3, 7, 14, 18), (4, 11, 15, 19), (8, 12, 16, 20)),
 ((1, 12, 15, 18), (2, 5, 16, 19), (3, 6, 9, 20), (4, 7, 10, 13), (8, 11, 14, 17)),
 ((1, 7, 11, 20), (2, 8, 13, 18), (3, 5, 10, 15), (4, 9, 16, 17), (6, 12, 14, 19)),
 ((1, 8, 10, 19), (2, 7, 9, 15), (3, 12, 13, 17), (4, 5, 14, 20), (6, 11, 16, 18))]

请注意,它不一定是这个精确的解决方案。这只是我在尝试最终为任意数量的玩家生成时间表时发现的示例解决方案。

(完全重写答案。)

尽管 可能 可以将此问题转换为精确覆盖问题并(原则上)使用算法 X 解决它,但在实践中证明这是不切实际的,并且是更好的方法。


先回答你的问题:回想一下精确覆盖问题:给定一堆“项目”(要覆盖)和一堆“选项”(每个选项覆盖一组项目),问题是找到一组选项,使得每个项目都恰好被覆盖一次。

在您的情况下,如果您选择 (20) 名玩家作为项目并选择四人一组的玩家作为选项,那么,正如您发现的那样,任何针对精确覆盖问题的算法都会找到调度方法 一轮 锦标赛。

但实际上你根本不需要它,因为(在没有进一步限制的情况下)你可以明确地写下所有解决方案:有(20 选 4)=4845 种选择第一组的方法- of-4,然后 (16 choose 4)=1820 种从剩余的 group-of-4 中选择另一组 4 的方法,等等,最后你不关心你找到的五个 group-of-4 之间的顺序,所以你可以将一组 20 人分成五组,每组 4 种方法的数量是

(选择(20, 4) * 选择(16, 4) * 选择(12, 4) * 选择(8, 4) * 选择(4, 4)) / (5 * 4 * 3 * 2 * 1) = 2546168625.

(等价于:(20!)/((4!)^5 * 5!) = 2546168625,因为我们可以写下 20 名玩家的列表,然后在每组 4 人中重新排序,并重新排序这些组.)

如果你想用一个程序生成所有这些,你可以按规范顺序编写它们中的每一个:假设你调用 20 个玩家 'A' 到 'T',那么你可以按字典顺序写下每组 4(因此,组 {F, J, B, Q} 将写为“BFJQ”),最后你可以按字典顺序写下这五个组本身(因此第一组将从 "A" 开始,第二组最早的字母不在第一组中,依此类推。

接下来,如果您想覆盖多轮,再次将其定义为类似精确覆盖的问题,您需要具有上述数量(≈25 亿)的“选项”(行)。目前尚不清楚这些项目是什么,但这显然是不切实际的,因此不值得追求这种思路。


反倒是你的问题研究的很好,原来是在Kirkman's schoolgirl problem (scheduling, from 15 people, groups of 3 as many times as possible — turns out to be 7) (see Ray Toal's page here), and more recently as the “social golfer problem” (see Markus Triska's page here这个名字下的):

The Social Golfer Problem (SGP) is a combinatorial optimisation problem. The task is to schedule g × p golfers in g groups of p players for w weeks such that no two golfers play in the same group more than once. An instance of the SGP is denoted by the triple g-p-w. The original problem asks for the maximal w such that the instance 8-4-w can be solved.

而你的问题在这里要求最大 w 这样实例 5-4-w 可以被解决。答案原来是 w=5,问题上的 MathWorld page 恰好给出了这个 5-4-5 实例:

Mon ABCD    EFGH    IJKL    MNOP    QRST
Tue AEIM    BJOQ    CHNT    DGLS    FKPR
Wed AGKO    BIPT    CFMS    DHJR    ELNQ
Thu AHLP    BKNS    CEOR    DFIQ    GJMT
Fri AFJN    BLMR    CGPQ    DEKT    HIOS

另请参阅 Warwick Harvey's page,其中有许多不同参数的最知名解决方案。它记录了 5 是这个问题的上限和下限,即已知如何安排 5 轮(你自己找到了!)并且知道不可能安排超过 5 轮。

您可以搜索“social golfer problem”的文献,找到更多计算机解决问题的方法。更一般地说,像这样的问题在被称为 Combinatorial Designs. One great reference I found while writing an 的广阔组合学领域中得到了研究,这本书是 组合设计手册 ,其中有像 VI.3 Balanced Tournament Designs 这样的章节,和 VI.51 安排比赛。