从字典和维度生成所有可能组合的功能性尾递归方式

Question

我想找出实现以下指定函数的简洁、实用和尾递归（如果可能）的方法：

(define (make-domain digits dimension)
    ;; Implementation)
;; Usage
(make-domain '(0 1) 0) => (())
(make-domain '(0 1) 1) => ((0) (1))
(make-domain '(0 1) 2) => ((0 0) (0 1) (1 0) (1 1))
(make-domain '(0 1) 3) => ((0 0 0) (0 0 1) (0 1 0) (0 1 1) (1 0 0) (1 0 1) (1 1 0) (1 1 1))

我更喜欢使用尽可能少的辅助函数或库函数实现 Scheme，但 SML 或 Haskell 也可以。我正在尝试找到可能使用相互或嵌套递归的尾递归解决方案，但目前运气不佳。

非常感谢！

Answer 1

Haskell 中的那个至少是“功能性的”和简洁的（我认为）：

makeDomain :: [α] -> Int -> [[α]]
makeDomain xs 0  =  [[]]
makeDomain xs n  =  let  mdn1 = makeDomain xs (n-1)
                         fn x = map (x:) mdn1
                    in   concat (map fn xs)

正在尝试：

 λ> 
 λ> makeDomain [0,1] 2
[[0,0],[0,1],[1,0],[1,1]]
 λ> 
 λ> makeDomain [0,1] 3
[[0,0,0],[0,0,1],[0,1,0],[0,1,1],[1,0,0],[1,0,1],[1,1,0],[1,1,1]]
 λ>

如评论中所述，至少在 Haskell 中，尾递归可能不是一个好主意。

关于内存效率的附录：

您没有在需求中列出性能问题（是因为您认为尾递归函数往往性能更好吗？）。

makeDomain 的上述版本，正如 amalloy 的评论中所暗示的那样，内存消耗呈指数级增长，至少对于某些编译器版本/优化级别而言。这是因为编译器可以将 makeDomain xs (n-1) 视为要保留的循环不变值。

因此，在这种情况下，您必须在优雅和效率之间做出权衡。这个问题最近已在 in the context of the very similar replicateM library function; drawing on the answer by K. A. Buhr, one can come up with a version of makeDomain that runs in constant memory, leveraging the Haskell list comprehension 结构中讨论过。

makeDomain1 :: [α] -> Int -> [[α]]
makeDomain1 xs n =
    map reverse (helper xs n)
        where
            helper xs 0 = [[]]
            helper xs n = [ x:ys  |  ys <- helper xs (n-1),  x <- xs ]

测试： 运行 1200 MB 的 OS 强制内存硬限制。

 λ> 
 λ> import Control.Monad (replicateM)
 λ> replicateM 3 [0,1]
[[0,0,0],[0,0,1],[0,1,0],[0,1,1],[1,0,0],[1,0,1],[1,1,0],[1,1,1]]
 λ> 
 λ> makeDomain1 [0,1] 3
[[0,0,0],[0,0,1],[0,1,0],[0,1,1],[1,0,0],[1,0,1],[1,1,0],[1,1,1]]
 λ> 
 λ> length $ replicateM 30 [0,1]
<interactive>: internal error: Unable to commit 1048576 bytes of memory
...

 λ> 
 λ> length $ makeDomain [0,1] 30
<interactive>: internal error: Unable to commit 1048576 bytes of memory
...

 λ> 
 λ> length $ makeDomain1 [0,1] 30
1073741824
 λ>

使用带有 -O2 选项的 GHC v8.6.5，最后一个版本永远不会占用超过 150 MB 的内存，并且在 vanilla Intel x86-64 PC 上以每个输出列表大约 30 纳秒的速度运行。这是完全合理的。

Answer 2

这是我对解决上述问题的建设性看法。

解决方案在 Scheme 中是功能性的、简洁的、递归的（但不是尾递归的）实现。

这个想法是域有一个归纳（递归）定义：域（第一个映射）中的每个组合都是一对数字，从初始数字字典中一对一地取出，所有组合为缩小一维（第二张地图）

(define (make-domain digits dimension)
  "Builds all combinations of digits for a dimension"
  ;; There is an empty combination for a dimension 0
  (if [zero? dimension] '(())
      ;; Combine all combinations
      (apply append
             ;; For each digit from digits
             (map (lambda (d)
                    ;; Prepend the digit to each combination
                    ;; for a smaller by one dimension
                    (map (lambda (sd) (cons d sd))
                         (make-domain digits (1- dimension))))
                  digits))))

Answer 3

可以通过使用累加器的常用技巧实现尾递归。以下是 Racket 而不是 Scheme，但也许只是因为它使用了 append* 我认为可以定义为

(define (append* . args)
  (apply append args))

这里是尾递归版本，因此：

(define (make-domain digits dimension)
  (let mdl ([d dimension] [r '(())])
    (if (zero? d)
        r
        (mdl (- d 1)
             (append* (map (λ (d)
                             (map (λ (sd)
                                    (cons d sd))
                                  r))
                           digits))))))

Answer 4

为了完整起见，这里是翻译成标准 ML 的 Haskell 解决方案：

fun curry f x y = f (x, y)
fun concatMap f xs = List.concat (List.map f xs)

fun makeDomain _ 0 = [[]]
  | makeDomain ys n =
    let val res = makeDomain ys (n-1)
    in concatMap (fn x => map (curry op:: x) res) ys
    end

可以应用累加器的常用技巧来避免 tfb demonstrates. But as amalloy 指出的 n 堆栈帧，这几乎不是这个函数的瓶颈，因为它的内存使用 n。在标准 ML 变体中，过多的列表连接会花费更多。

因此，根据您打算对此列表执行的操作，您可能需要考虑在标准 ML 中生成其元素并一次处理一个元素（就像惰性流允许您这样做）；例如，与其生成一个长列表并对其进行过滤，不如生成过滤后的列表。这是一个示例：Translation of Pythagorean Triplets from Haskell to Standard ML.

从字典和维度生成所有可能组合的功能性尾递归方式

Functional, tail-recursive way to generate all possible combinations from a dictionary and a dimension

recursion

scheme

haskell

functional-programming

sml

关于内存效率的附录：