优化排列生成器，其中每个排列的总和为相同的值

Question

我想创建一个排列或笛卡尔乘积列表（不确定此处适用哪个），其中每个排列中的值总和等于提供的值。

函数需要三个参数。

样本大小：每个排列中的项目数
期望总和：每个排列的总和应达到
数字集：排列中可以包含重复的数字集

我有一个在下面工作的实现，但它看起来很慢我更愿意使用迭代器来流式传输结果，但我还需要一个能够计算迭代器将产生的项目总数的函数.

def buildPerms(sample_size, desired_sum, set_of_number):
    blank = [0] * sample_size
    return recurseBuildPerms([], blank, set_of_number, desired_sum)

def recurseBuildPerms(perms, blank, values, desired_size, search_index = 0):
    for i in range(0, len(values)):
        for j in range(search_index, len(blank)):
            if(blank[j] == 0):
                new_blank = blank.copy()
                new_blank[j] = values[i]
                remainder = desired_size - sum(new_blank)
                new_values = list(filter(lambda x: x <=  remainder, values))
                if(len(new_values) > 0):
                    recurseBuildPerms(perms, new_blank, new_values, desired_size, j)
                elif(sum(new_blank) <= desired_size):
                    perms.append( new_blank)
    return perms

perms = buildPerms(4, 10, [1,2,3])
print(perms)
## Output
[[1, 3, 3, 3], [2, 2, 3, 3], [2, 3, 2, 3], 
 [2, 3, 3, 2], [3, 1, 3, 3], [3, 2, 2, 3], 
 [3, 2, 3, 2], [3, 3, 1, 3], [3, 3, 2, 2], 
 [3, 3, 3, 1]]

https://www.online-python.com/9cmOev3zlg

问题：

谁能帮我把我的解决方案转换成迭代器？
是否可以在不查看完整列表的情况下计算出项目总数？

Answer 1

这是将其分解为两个子问题的一种方法：

找到 target_sum 的所有限制整数分区到 sample_size 加数 s.t。所有加数都来自 set_of_number.
为每个分区计算多集排列（占用大部分时间）。

问题1可以用动态规划来解决。我在第 2 部分中使用了 sympy 中的 multiset_permutations，尽管您可以通过编写自己的 numba 代码来获得更好的性能。

代码如下：

from functools import lru_cache
from sympy.utilities.iterables import multiset_permutations

@lru_cache(None)
def restricted_partitions(n, k, *xs):
  'partitions of n into k summands using only elements in xs (assumed positive integers)'
  if n == k == 0:
    # case of unique empty partition
    return [[]]
  elif n <= 0 or k <= 0 or not xs:
    # case where no partition is possible
    return []
  # general case
  result = list()
  x = xs[0] # element x we consider including in a partition
  i = 0     # number of times x should be included
  while True:
    i += 1
    if i > k or x * i > n:
      break
    for rest in restricted_partitions(n - x * i, k - i, *xs[1:]):
      result.append([x] * i + rest)
  result.extend(restricted_partitions(n, k, *xs[1:]))

  return result
    
def buildPerms2(sample_size, desired_sum, set_of_number):
  for part in restricted_partitions(desired_sum, sample_size, *set_of_number):
    yield from multiset_permutations(part)


# %timeit sum(1 for _ in buildPerms2(8, 16, [1, 2, 3, 4])) # 16 ms
# %timeit sum(1 for _ in buildPerms (8, 16, [1, 2, 3, 4])) # 604 ms

当前的解决方案需要在迭代开始之前计算所有受限分区，但如果可以快速计算受限分区，它可能仍然实用。迭代计算分区也是可能的，尽管这可能需要更多的工作。

关于第二个问题，你确实可以计算出这样的排列的数量，而无需全部生成它们：

# present in the builtin math library for Python 3.8+
@lru_cache(None)
def binomial(n, k):
  if k == 0:
    return 1
  if n == 0:
    return 0
  return binomial(n - 1, k) + binomial(n - 1, k - 1)


@lru_cache(None)
def perm_counts(n, k, *xs):
  if n == k == 0:
    # case of unique empty partition
    return 1
  elif n <= 0 or k <= 0 or not xs:
    # case where no partition is possible
    return 0
  # general case
  result = 0
  x = xs[0] # element x we consider including in a partition
  i = 0     # number of times x should be included
  while True:
    i += 1
    if i > k or x * i > n:
      break
    result += binomial(k, i) * perm_counts(n - x * i, k - i, *xs[1:])

  result += perm_counts(n, k, *xs[1:])

  return result

# assert perm_counts(15, 6, *[1,2,3,4]) == sum(1 for _ in buildPerms2(6, 15, [1,2,3,4])) == 580
# perm_counts(1000, 100, *[1,2,4,8,16,32,64])
# 902366143258890463230784240045750280765827746908124462169947051257879292738672

用于计算所有受限排列的函数看起来与上面生成分区的函数非常相似。唯一显着的变化是在以下行中：

result += binomial(k, i) * perm_counts(n - x * i, k - i, *xs[1:])

有 i 个 x 的副本要包含，k 个可能的位置 x 可能会结束。考虑到这种多样性，解决递归的方法数 sub-problem 乘以 k 选择 i。

优化排列生成器，其中每个排列的总和为相同的值

Optimizing permutation generator where total of each permutation totals to same value

python

algorithm

optimization

permutation