When to use foldr with a continuation as an accumulation function?

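There is a technique I have seen a few times with foldr. It involves using a function in place of the accumulator in foldr. I am wondering when it is necessary to do this, as opposed to using an accumulator that is just a regular value.

Most people have seen this technique when using foldr to define foldl: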

{-# LANGUAGE ScopedTypeVariables #-}

myFoldl :: forall a b. (b -> a -> b) -> b -> [a] -> b
myFoldl accum nil as = foldr f id as nil
  where
    f :: a -> (b -> b) -> b -> b
    f a continuation b = continuation $ accum b a

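Here, the type of the combining function f is not the usual a -> b -> b, but a -> (b -> b) -> b -> b. It takes not only an a and a b, but also a continuation (b -> b), to which we must pass the b in order to get the final b.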
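To see why the elements end up being combined from the left, it can help to unfold a small call by hand. The sketch below repeats the definition above (minus the local type signature, so no extension is needed) and traces one call in comments; the choice of (-) as the operator and the name example are mine, purely for illustration.

```haskell
-- Same definition as above, written without the local type signature
-- so that no language extension is needed.
myFoldl :: (b -> a -> b) -> b -> [a] -> b
myFoldl accum nil as = foldr f id as nil
  where
    f a continuation b = continuation (accum b a)

-- Unfolding myFoldl (-) 10 [1, 2, 3] by hand:
--
--   foldr f id [1, 2, 3] 10
-- = f 1 (foldr f id [2, 3]) 10
-- = foldr f id [2, 3] (10 - 1)
-- = f 2 (foldr f id [3]) (10 - 1)
-- = foldr f id [3] ((10 - 1) - 2)
-- = f 3 (foldr f id []) ((10 - 1) - 2)
-- = foldr f id [] (((10 - 1) - 2) - 3)
-- = id (((10 - 1) - 2) - 3)
-- = 4
example :: Int
example = myFoldl (-) 10 [1, 2, 3]
```

Each step of f does no work on the rest of the list itself; it only updates the b and hands it to the continuation, which is what makes the accumulation run left to right.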

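I recently saw an example of this trick in the book Parallel and Concurrent Programming in Haskell. Here is a link to the source code of the example using this trick. Here is a link to the chapter of the book explaining this example.

I have taken the liberty of simplifying that source code into a similar (but shorter) example. Below is a function that takes a list of strings, prints out whether the length of each string is greater than 5, and then prints the full list of only those strings whose length is greater than 5: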

import Text.Printf

stringsOver5 :: [String] -> IO ()
stringsOver5 strings = foldr f (print . reverse) strings []
  where
    f :: String -> ([String] -> IO ()) -> [String] -> IO ()
    f str continuation strs = do
      let isGreaterThan5 = length str > 5
      printf "Working on \"%s\", greater than 5? %s\n" str (show isGreaterThan5)
      if isGreaterThan5
        then continuation $ str : strs
        else continuation strs

Here is an example of using it in GHCi:

> stringsOver5 ["subdirectory", "bye", "cat", "function"]
Working on "subdirectory", greater than 5? True
Working on "bye", greater than 5? False
Working on "cat", greater than 5? False
Working on "function", greater than 5? True
["subdirectory","function"]

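Just like in the myFoldl example, you can see that the combining function f uses the same trick.

However, it occurred to me that this stringsOver5 function could probably be written without this trick: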

stringsOver5PlainFoldr :: [String] -> IO ()
stringsOver5PlainFoldr strings = foldr f (pure []) strings >>= print
  where
    f :: String -> IO [String] -> IO [String]
    f str ioStrs = do
      let isGreaterThan5 = length str > 5
      printf "Working on \"%s\", greater than 5? %s\n" str (show isGreaterThan5)
      if isGreaterThan5
        then fmap (str :) ioStrs
        else ioStrs

(Although maybe you could make the argument that IO [String] is a continuation?)


I have two questions regarding this:


  • Is it ever absolutely necessary to use this trick of passing a continuation to foldr? Is there an example of a function that absolutely can't be written without this trick? (Aside from foldl and functions like that, of course.)

No, never. Every foldr call can always be replaced by explicit recursion.
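The reason is that foldr on lists is itself only a fixed recursion pattern, so any call can be rewritten by inlining it. As a minimal sketch (myFoldr, sumViaFoldr and sumViaRecursion are names made up for this illustration, and myFoldr is the textbook list definition rather than the Foldable method that base actually exports):

```haskell
-- Textbook definition of foldr on lists (hypothetical name myFoldr;
-- the real foldr in base is a Foldable method).
myFoldr :: (a -> r -> r) -> r -> [a] -> r
myFoldr _ z []       = z
myFoldr f z (x : xs) = f x (myFoldr f z xs)

-- A fold whose "accumulator" is a function awaiting the running total ...
sumViaFoldr :: [Int] -> Int
sumViaFoldr xs = myFoldr (\x k acc -> k (acc + x)) id xs 0

-- ... and the same function after inlining myFoldr by hand.
sumViaRecursion :: [Int] -> Int
sumViaRecursion xs = go xs 0
  where
    go []       acc = acc
    go (y : ys) acc = go ys (acc + y)
```

The extra argument passed after the list in the foldr version simply becomes the second parameter of go.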

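One should use foldr and other well-known library functions when they make the code simpler. When they do not, the code should not be shoehorned to fit the foldr pattern. Your example, for instance, can be written with explicit recursion: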



import Text.Printf (printf)

stringsOver5 :: [String] -> IO ()
stringsOver5 strings = go strings []
  where
    -- acc collects the matching strings in reverse order.
    go :: [String] -> [String] -> IO ()
    go []     acc = print (reverse acc)
    go (s:ss) acc = do
      let isGreaterThan5 = length s > 5
      printf "Working on \"%s\", greater than 5? %s\n" s (show isGreaterThan5)
      if isGreaterThan5
        then go ss (s:acc)
        else go ss acc

  • When would I want to use this trick in my own code? Is there any example of a function that can be significantly simplified by using this trick?


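Personally, I find "calling foldr with four (or more) arguments" to be an anti-pattern in most cases. It is no shorter than explicit recursion, and it can be considerably less readable.

I would argue that this "idiom" is rather puzzling to any Haskeller who has not seen it before. It is something of an acquired taste.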


That said, a short one-liner such as

foldr (.) id listOfDLists []

is pretty, even if the final [] might be puzzling at first.
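For readers who have not met difference lists, here is a small sketch of what that one-liner does. The DList synonym and the helper names fromList and concatDLists are assumptions made for this example; they are not taken from any particular library.

```haskell
-- A difference list represents a list as "the function that prepends it".
type DList a = [a] -> [a]

-- Hypothetical helper: turn an ordinary list into a difference list.
fromList :: [a] -> DList a
fromList xs = (xs ++)

-- foldr (.) id composes all the prepend functions into one;
-- the trailing [] is the list that the composed function is applied to.
concatDLists :: [DList a] -> [a]
concatDLists listOfDLists = foldr (.) id listOfDLists []
```

For example, concatDLists [fromList [1, 2], fromList [3], fromList [4, 5]] evaluates to [1, 2, 3, 4, 5]. Here the "accumulator" built by foldr is a function, and the final [] is the extra argument it is waiting for.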

  • Are there any performance considerations to take into account when using this trick? (Or, well, when not using this trick?)

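Performance should be essentially the same as with explicit recursion. GHC might even generate exactly the same code.

Perhaps using foldr could help GHC fire some fold/build optimization rules, but I am not sure whether that matters when continuations are involved.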