执行笛卡尔积时如何减少内存消耗?

How to reduce memory consumption when performing a cartesian product?

给定一个二维矩阵,例如 [[a,b,c],[d,e,f]...]],我想计算矩阵的笛卡尔积,以便确定所有可能的组合。

对于这个特定的约束,当我使用具有 12 个不同子集的二维矩阵时,它使用的内存超过了我分配的 16 兆字节内存。每个子集中有三个值,所以我会有 312 个不同的组合。

我使用的笛卡尔积函数是:

def cartesian_iterative(pools):
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    return result

我想知道如何在不使用任何外部库的情况下减少内存消耗。我将使用的示例二维数组是 [['G', 'H', 'I'], ['M', 'N', 'O'], ['D', 'E', 'F'], ['D', 'E', 'F'], ['P', 'R', 'S'], ['D', 'E', 'F'], ['M', 'N', 'O'], ['D', 'E', 'F'], ['D', 'E', 'F'], ['M', 'N', 'O'], ['A', 'B', 'C'], ['D', 'E', 'F']]

编辑: 作为参考,可以在此处 Problem Statement. Here is the link to the file of possible names Acceptable Names 找到问题陈述的 link。

最终代码:

with open('namenum.in','r') as fin:
    num = str(fin.readline().strip()) #the number being used to determine all combinations

numCount = []
for i in range(len(num)):
    numCount.append(dicti[num[i]]) #creates a 2d array where each number in the initial 'num' has a group of three letters


def cartesian_iterative(pools): #returns the product of a 2d array
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    return result

pos = set() #set of possible names
if len(num) == 12: #only uses more than the allocated memory when the num is 12 digits long.
    '''
    This optimization allows the product to only calculate 2 * 3^6 values, instead of 3**12. This saves a lot of memory
    '''
    rights = cartesian_iterative(numCount[6:])
    for left in cartesian_iterative(numCount[:6]):
        for right in rights:
            a = ''.join(left+right)
            if a in names:
                pos.add(a) #adding name to set
else: #if len(num) < 12, you do not need any other optimizations and can just return normal product 
    for i in cartesian_iterative(numCount):
        a = ''.join(i)
        if a in names:
            pos.add(a)
pos = sorted(pos)


with open('namenum.out','w') as fout: #outputting all possible names
    if len(pos) > 0:
        for i in pos:
            fout.write(i)
            fout.write('\n')
    else:
        fout.write('NONE\n')

您可以分别在左半边和右半边使用该功能。那么你只有 2×36 个组合而不是 312。而且它们只有一半长,甚至可以抵消因子 2。

for left in cartesian_iterative(pools[:6]):
    for right in cartesian_iterative(pools[6:]):
        print(left + right)

输出:

['G', 'M', 'D', 'D', 'P', 'D', 'M', 'D', 'D', 'M', 'A', 'D']
['G', 'M', 'D', 'D', 'P', 'D', 'M', 'D', 'D', 'M', 'A', 'E']
['G', 'M', 'D', 'D', 'P', 'D', 'M', 'D', 'D', 'M', 'A', 'F']
['G', 'M', 'D', 'D', 'P', 'D', 'M', 'D', 'D', 'M', 'B', 'D']
...

为了更快,只计算一次正确的组合:

rights = cartesian_iterative(pools[6:])
for left in cartesian_iterative(pools[:6]):
    for right in rights:
        print(left + right)