Combination of combinations that cut off at a sum of 4

Question

So far I have mylist = list(itertools.product(*a))

The problem is that it generates too many tuples. I don't want it to generate a tuple if that tuple's sum is > 4. For example:

[(0, 0, 0, 0),
 (0, 0, 0, 1),
 (0, 0, 0, 2),
 (0, 0, 1, 0),
 (0, 0, 1, 1),
 (0, 0, 1, 2),
 (0, 1, 0, 0),
 (0, 1, 0, 1),
 (0, 1, 0, 2),
 (0, 1, 1, 0),
 (0, 1, 1, 1),
 (0, 1, 1, 2),
 (1, 0, 0, 0),
 (1, 0, 0, 1),
 (1, 0, 0, 2),
 (1, 0, 1, 0),
 (1, 0, 1, 1),
 (1, 0, 1, 2),
 (1, 1, 0, 0),
 (1, 1, 0, 1),
 (1, 1, 0, 2),
 (1, 1, 1, 0),
 (1, 1, 1, 1),
 (1, 1, 1, 2)]

It should not produce (1, 1, 1, 2) because it sums to 5. In this example only one tuple is affected, but in other cases there will be many more.

Answer 1

If your dataset is large, you can use numpy here.

numpy.indices provides an equivalent of itertools.product whose result you can also filter efficiently,

import numpy as np

arr = np.indices((4, 4, 4, 4)).reshape(4,-1).T
mask = arr.sum(axis=1) < 5
res = arr[mask]
print(res)

#[[0 0 0 0]
# [0 0 0 1]
# [0 0 0 2]
# [0 0 0 3]
# [0 0 1 0]
#  ... 
# [3 0 0 1]
# [3 0 1 0]
# [3 1 0 0]]

Otherwise, for small datasets, as mentioned in the comments, itertools.ifilter is very fast,

from itertools import product, ifilter  # ifilter is Python 2 only
gen = product((0,1,2,3), repeat=4)
res = ifilter(lambda x: sum(x) < 5, gen)  # keep tuples whose sum is at most 4
res = list(res) # converting to list only at the end
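
On Python 3, itertools.ifilter no longer exists; the built-in filter (or a comprehension) is lazy and does the same job. Below is a minimal sketch of that filter applied to the question's own list. The contents of `a` and the helper name `tuples_with_sum_at_most` are assumptions for illustration, chosen so that the unfiltered product matches the example output in the question.

```python
from itertools import product

def tuples_with_sum_at_most(ranges, limit):
    """Return every tuple of the Cartesian product whose sum is <= limit."""
    # product() is lazy, so rejected tuples are never stored in the result.
    return [t for t in product(*ranges) if sum(t) <= limit]

# Hypothetical input: three positions taking 0..1 and one taking 0..2,
# which reproduces the 24 tuples listed in the question.
a = [range(2), range(2), range(2), range(3)]
mylist = tuples_with_sum_at_most(a, 4)
```

Note that this still visits every tuple of the product; it only avoids keeping the ones over the limit.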

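In this particular case, the two approaches perform comparably.

If you need better performance for this specific case, you can always write an optimized routine in C or Cython.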
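
Before reaching for C or Cython, one option that may be worth trying is to avoid generating the unwanted tuples at all, by abandoning a prefix as soon as its partial sum exceeds the limit. This is a pure-Python sketch, not part of the original answer; the name `bounded_product` is hypothetical, and it assumes every value is a non-negative number, so a partial sum over the limit can never come back under it.

```python
def bounded_product(ranges, limit):
    """Yield tuples from the Cartesian product of `ranges` with sum <= limit.

    Assumption: all values are non-negative, so any prefix whose partial
    sum already exceeds `limit` can be skipped together with its extensions.
    """
    def extend(prefix, total, depth):
        if depth == len(ranges):
            yield tuple(prefix)
            return
        for value in ranges[depth]:
            if total + value > limit:
                continue  # prune: no extension of this prefix can qualify
            prefix.append(value)
            yield from extend(prefix, total + value, depth + 1)
            prefix.pop()

    return extend([], 0, 0)

# Same search space as the answer above: four positions, each 0..3.
result = list(bounded_product([range(4)] * 4, 4))
```

The tuples come out in the same order that itertools.product would produce them. Whether this beats the plain filter depends on how much of the product is pruned, so it should be measured on the real data.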

python

itertools

python-2.7