生成没有重复的每个组合的列表

generating list of every combination without duplicates

我想生成一个组合列表。我会尽量简化我的问题,使其易于理解。

我们有 3 个变量:

我想使用 python 生成所有可能组合的列表,没有任何重复的知道:我不关心组的顺序和组内字母的顺序。

例如,x = 4, k = 2, n = 2 :

# we start with 4 letters, we want to make 2 groups of 2 letters
letters = ['A','B','C','D']

# here would be a code that generate the list

# Here is the result that is very simple, only 3 combinations exist.
combos = [ ['AB', 'CD'], ['AC', 'BD'], ['AD', 'BC'] ]

由于我不关心组内或组内的顺序,以及组内的字母,['AB', 'CD']['DC', 'BA'] 是重复的。

这是对我的实际问题的简化,它具有以下值:x = 12k = 4n = 3。我尝试使用 itertools 中的一些函数,但是由于组合太多,我的电脑死机了。

看待问题的另一种方式:你有 12 名球员,你想组成 4 支球队,每队 3 名球员。所有的可能性是什么?

谁能帮我找到生成此列表的优化解决方案?

首先,您可以使用列表推导式为您提供所有可能的组合(无论是否重复):

comb = [(a,b) for a in letters for b in letters if a != b]

然后,您可以使用 sorted 函数对元组进行排序。之后,要删除重复项,您可以将所有项目转换为一个集合,然后再转换回列表。

var = [tuple(sorted(sub)) for sub in comb]
var = list(set(var))

使用 itertools 的组合

from itertools import combinations 

x = list(combinations(['A','B','C','D'],2))

t = []
for i in (x):
    t.append(i[0]+i[1]) # concatenating the strings and adding in a list

g = []
for i in range(0,len(t),2):
    for j in range(i+1,len(t)):
        g.append([t[i],t[j]])
        break
print(g)

您可以使用列表理解方法,它的时间复杂度为 O(n*n-1),或者您可以使用更冗长的方法,但时间复杂度为 O(n^2) -n)/2:

comb = []

for first_letter_idx, _ in enumerate(letters):
    for sec_letter_idx in range(first_letter_idx + 1, len(letters)):
        comb.append(letters[first_letter_idx] + letters[sec_letter_idx])

print(comb)

comb2 = []

for first_letter_idx, _ in enumerate(comb):
    for sec_letter_idx in range(first_letter_idx + 1, len(comb)):
        if (comb[first_letter_idx][0] not in comb[sec_letter_idx]
        and comb[first_letter_idx][1] not in comb[sec_letter_idx]):
            comb2.append([comb[first_letter_idx], comb[sec_letter_idx]])

print(comb2)

此算法需要更多工作来处理动态输入。也许用递归。

肯定会有更多 sophisticated/efficient 方法来做到这一点,但这里有一种方法可以在合理的时间内为您的示例工作,并且应该足够容易以适应其他情况。

它会根据您的规格生成独特的团队及其独特的组合。

from itertools import combinations

# this assumes that team_size * team_num == len(players) is a given
team_size = 3
team_num = 4
players = list('ABCDEFGHIJKL')
unique_teams = [set(c) for c in combinations(players, team_size)]

def duplicate_player(combo):
    """Returns True if a player occurs in more than one team"""
    return len(set.union(*combo)) < len(players)
    
result = (combo for combo in combinations(unique_teams, team_num) if not duplicate_player(combo))

result 是一个生成器,可以用 list(result) 迭代或变成列表。在 kaggle.com 上,生成所有可能组合的完整列表需要一分钟左右的时间(总共 15400,与评论中@beaker 和 @John Coleman 的计算一致)。团队是看起来像这样的集合的元组:

[({'A', 'B', 'C'}, {'D', 'E', 'F'}, {'G', 'H', 'I'}, {'J', 'K', 'L'}),
 ({'A', 'B', 'C'}, {'D', 'E', 'F'}, {'G', 'H', 'J'}, {'I', 'K', 'L'}),
 ({'A', 'B', 'C'}, {'D', 'E', 'F'}, {'G', 'H', 'K'}, {'I', 'J', 'L'}),
 ...
]

如果需要,您可以通过对它们中的每一个调用 ''.join() 将它们转换为字符串。

另一个解决方案(玩家编号为 0、1、...):

import itertools

def equipartitions(base_count: int, group_size: int):
    if base_count % group_size != 0:
        raise ValueError("group_count must divide base_count")

    return set(_equipartitions(frozenset(range(base_count)), group_size))


def _equipartitions(base_set: frozenset, group_size: int):
    if not base_set:
        yield frozenset()

    for combo in itertools.combinations(base_set, group_size):
        for rest in _equipartitions(base_set.difference(frozenset(combo)), group_size):
            yield frozenset({frozenset(combo), *rest})


all_combinations = [
    [tuple(team) for team in combo]
        for combo in equipartitions(12, 3)
]

print(all_combinations)
print(len(all_combinations))

还有一个:

import itertools
from typing import Iterable

def equipartitions(players: Iterable, team_size: int):
    if len(players) % team_size != 0:
        raise ValueError("group_count must divide base_count")

    return _equipartitions(set(players), team_size)


def _equipartitions(players: set, team_size: int):
    if not players:
        yield []
        return

    first_player, *other_players = players
    for other_team_members in itertools.combinations(other_players, team_size-1):
        first_team = {first_player, *other_team_members}
        for other_teams in _equipartitions(set(other_players) - set(first_team), team_size):
            yield [first_team, *other_teams]


all_combinations = [
    {''.join(sorted(team)) for team in combo} for combo in equipartitions(players='ABCDEFGHIJKL', team_size=3)
]


print(all_combinations)
print(len(all_combinations))