How to make itertools combinations 'increase' evenly?
Consider the following example:
import itertools
import numpy as np
a = np.arange(0,5)
b = np.arange(0,3)
c = np.arange(0,7)
prods = itertools.product(a,b,c)
for p in prods:
    print(p)
This iterates over the products in the following order:
(0, 0, 0)
(0, 0, 1)
(0, 0, 2)
(0, 0, 3)
(0, 0, 4)
(0, 1, 0)
But I would much rather have the products given in order of their sum, e.g.
(0, 0, 0)
(0, 0, 1)
(0, 1, 0)
(1, 0, 0)
(0, 1, 1)
(1, 0, 1)
(1, 1, 0)
(0, 0, 2)
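How can I achieve this without storing all combinations in memory?
Note: a, b and c are always ranges, but they do not necessarily have the same maximum. There is also no secondary ordering when the sums of two products are equal, i.e. (0,1,1) is equivalent to (2,0,0).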
If the step is always 1 and avoiding storing all combinations is your top priority, you could do the following (partially using itertools.product):
import itertools
import numpy as np


def weak_compositions(boxes, balls, parent=tuple()):
    """Yield every way to write `balls` as an ordered sum of `boxes` non-negative integers."""
    if boxes > 1:
        for i in range(balls + 1):
            for x in weak_compositions(boxes - 1, i, parent + (balls - i,)):
                yield x
    else:
        yield parent + (balls,)


def verify_limits(x, minimum, maximum):
    all_max = all(xi <= li for xi, li in zip(x, maximum))
    all_min = all(xi >= li for xi, li in zip(x, minimum))
    return all_max and all_min


def iterate_in_sum(ranges):
    prods = itertools.product(*ranges)

    # find the distinct sums (cast to int so the yielded tuples hold plain Python ints)
    unique_total_sums = sorted(set(int(sum(p)) for p in prods))

    # find the minimum limits
    minimum_allowed = [min(r) for r in ranges]

    # find the maximum limits
    maximum_allowed = [max(r) for r in ranges]

    for total_sum in unique_total_sums:
        # decompose each sum into its summands
        for x in weak_compositions(len(ranges), total_sum):
            # if the decomposition meets the limits
            if verify_limits(x, minimum_allowed, maximum_allowed):
                yield x


a = np.arange(0, 5)
b = np.arange(0, 3)
c = np.arange(0, 7)

for s in iterate_in_sum([a, b, c]):
    print(s, sum(s))
Output (partial)
(0, 0, 0) 0
(1, 0, 0) 1
(0, 1, 0) 1
(0, 0, 1) 1
(2, 0, 0) 2
(1, 1, 0) 2
(1, 0, 1) 2
(0, 2, 0) 2
(0, 1, 1) 2
(0, 0, 2) 2
(3, 0, 0) 3
(2, 1, 0) 3
(2, 0, 1) 3
(1, 2, 0) 3
(1, 1, 1) 3
(1, 0, 2) 3
(0, 2, 1) 3
(0, 1, 2) 3
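The core of the solution is the weak_compositions function, which decomposes a number into its summands (similar to an integer partition). More solutions to the above problem of composition of n into k parts can be found here.
Note: the solution can be extended to non-uniform steps at the cost of additional complexity.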
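One observation about the code above: iterate_in_sum walks the whole product once just to collect the distinct sums. When every range has step 1, each integer between the sum of the minima and the sum of the maxima is attainable, so the sums can be read off directly. The following is only a sketch under that assumption. The name iterate_in_sum_direct is made up for illustration, and the ranges are assumed to be non-empty, step 1, and non-negative.

```python
def weak_compositions(boxes, balls, parent=()):
    """Yield every way to write `balls` as an ordered sum of `boxes` non-negative ints."""
    if boxes > 1:
        for i in range(balls + 1):
            yield from weak_compositions(boxes - 1, i, parent + (balls - i,))
    else:
        yield parent + (balls,)


def iterate_in_sum_direct(ranges):
    """Hypothetical variant of iterate_in_sum that never touches itertools.product.

    Assumes every range is non-empty, has step 1, and holds non-negative values.
    """
    minimum_allowed = [r[0] for r in ranges]
    maximum_allowed = [r[-1] for r in ranges]

    # With step-1 ranges, every sum between these two bounds is reachable.
    for total_sum in range(sum(minimum_allowed), sum(maximum_allowed) + 1):
        for x in weak_compositions(len(ranges), total_sum):
            # Keep only the decompositions that stay inside each range.
            if all(lo <= xi <= hi
                   for xi, lo, hi in zip(x, minimum_allowed, maximum_allowed)):
                yield x


if __name__ == "__main__":
    for s in iterate_in_sum_direct([range(0, 5), range(0, 3), range(0, 7)]):
        print(s, sum(s))
```

This still generates and discards compositions that fall outside the limits, so it removes the up-front pass over the product but not the filtering cost.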
The easiest way to do this without storing extra products in memory is with recursion. Instead of range(a,b), pass in a list of (a,b) pairs and do the iteration yourself:
from typing import List, Tuple


def prod_by_sum(range_bounds: List[Tuple[int, int]]):
    """
    Yield from the Cartesian product of input ranges, produced in order of sum.

    >>> range_bounds = [(2, 4), (3, 6), (0, 2)]
    >>> for prod in prod_by_sum(range_bounds):
    ...    print(prod)
    (2, 3, 0)
    (2, 3, 1)
    (2, 4, 0)
    (3, 3, 0)
    (2, 4, 1)
    (2, 5, 0)
    (3, 3, 1)
    (3, 4, 0)
    (2, 5, 1)
    (3, 4, 1)
    (3, 5, 0)
    (3, 5, 1)
    """
    def prod_by_sum_helper(start: int, goal_sum: int):
        low, high = range_bounds[start]
        if start == len(range_bounds) - 1:
            # last range: the remaining sum must itself be a valid element
            if low <= goal_sum < high:
                yield (goal_sum,)
            return

        for current in range(low, min(high, goal_sum + 1)):
            yield from ((current,) + extra
                        for extra in prod_by_sum_helper(start + 1, goal_sum - current))

    lowest_sum = sum(lo for lo, hi in range_bounds)
    highest_sum = sum(hi - 1 for lo, hi in range_bounds)

    for goal_sum in range(lowest_sum, highest_sum + 1):
        yield from prod_by_sum_helper(0, goal_sum)
The output for
range_bounds = [(0, 5), (0, 3), (0, 7)]
starts with:
(0, 0, 0)
(0, 0, 1)
(0, 1, 0)
(1, 0, 0)
(0, 0, 2)
(0, 1, 1)
(0, 2, 0)
(1, 0, 1)
(1, 1, 0)
(2, 0, 0)
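You can do this exact process iteratively by modifying a single list and yielding copies of it, but the code either becomes more complicated or less efficient.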
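To give a feel for that trade-off, here is one possible iterative sketch. It keeps a single list that is mutated in place plus an explicit stack of iterators, and yields a tuple copy for each result. The name prod_by_sum_iterative and the helper candidates are made up for illustration, and the input is assumed to be a non-empty list of (low, high) pairs with non-negative lows, as in the recursive version.

```python
from typing import Iterator, List, Tuple


def prod_by_sum_iterative(range_bounds: List[Tuple[int, int]]) -> Iterator[Tuple[int, ...]]:
    """Hypothetical iterative counterpart of the recursive generator (a sketch)."""
    n = len(range_bounds)

    def candidates(pos: int, remaining: int) -> Iterator[int]:
        low, high = range_bounds[pos]
        if pos == n - 1:
            # Last position: the only usable value is the remaining sum itself.
            return iter((remaining,) if low <= remaining < high else ())
        return iter(range(low, min(high, remaining + 1)))

    lowest_sum = sum(lo for lo, hi in range_bounds)
    highest_sum = sum(hi - 1 for lo, hi in range_bounds)

    for goal_sum in range(lowest_sum, highest_sum + 1):
        current = [0] * n      # the single list that gets modified in place
        remaining = [0] * n    # remaining[i]: sum still needed from position i on
        remaining[0] = goal_sum
        stack = [candidates(0, goal_sum)]

        while stack:
            pos = len(stack) - 1
            value = next(stack[-1], None)
            if value is None:
                stack.pop()            # this position is exhausted, backtrack
                continue
            current[pos] = value
            if pos == n - 1:
                yield tuple(current)   # copy, since current keeps changing
            else:
                remaining[pos + 1] = remaining[pos] - value
                stack.append(candidates(pos + 1, remaining[pos + 1]))


if __name__ == "__main__":
    for prod in prod_by_sum_iterative([(2, 4), (3, 6), (0, 2)]):
        print(prod)
```

The bookkeeping that recursion handles implicitly (position, remaining sum, backtracking) has to be spelled out by hand here, which is the extra complexity mentioned above.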
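You can also modify it fairly easily to support steps other than 1, but this becomes less efficient as the steps get larger, because the last range may not contain the element needed to produce the current sum. That seems unavoidable, because at that point you would need to solve a hard computational problem to loop over these products efficiently in order of sum.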
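A minimal sketch of such a step-aware modification follows. It takes Python range objects instead of (a,b) pairs and relies on the constant-time membership test that range supports for integers. The name prod_by_sum_stepped is made up for illustration, and the ranges are assumed to be non-empty, to have positive steps, and to start at non-negative values.

```python
from typing import Iterator, List, Tuple


def prod_by_sum_stepped(ranges: List[range]) -> Iterator[Tuple[int, ...]]:
    """Hypothetical step-aware variant (a sketch, not a tuned implementation).

    Assumes non-empty ranges with positive steps and non-negative starts.
    """
    def helper(start: int, goal_sum: int) -> Iterator[Tuple[int, ...]]:
        r = ranges[start]
        if start == len(ranges) - 1:
            # With steps > 1 this check fails often, which is where the
            # wasted work described above comes from.
            if goal_sum in r:
                yield (goal_sum,)
            return
        for current in r:
            if current > goal_sum:
                break  # later values only overshoot, since nothing is negative
            for extra in helper(start + 1, goal_sum - current):
                yield (current,) + extra

    lowest_sum = sum(r[0] for r in ranges)
    highest_sum = sum(r[-1] for r in ranges)

    # Some sums in this interval may be unreachable; they simply yield nothing.
    for goal_sum in range(lowest_sum, highest_sum + 1):
        yield from helper(0, goal_sum)


if __name__ == "__main__":
    for prod in prod_by_sum_stepped([range(0, 5, 2), range(1, 8, 3), range(0, 4)]):
        print(prod, sum(prod))
```

Every candidate goal sum is tried even when no combination reaches it, so the amount of fruitless recursion grows with the step sizes.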