从 State 切换到 StateT 后，如何恢复单子构造列表的延迟评估？

Question

使用以下代码：

(lazy_test.hs)

-- Testing lazy evaluation of monadically constructed lists, using State.
import Control.Monad.State

nMax = 5

foo :: Int -> State [Int] Bool
foo n = do
  modify $ \st -> n : st
  return (n `mod` 2 == 1)

main :: IO ()
main = do
  let ress = for [0..nMax] $ \n -> runState (foo n) []
      sts  = map snd $ dropWhile (not . fst) ress
  print $ head sts

for = flip map

我可以将 nMax 设置为 5，即 50,000,000，我得到大致相同的运行时间：

nMax = 5:

$ stack ghc lazy_test.hs
[1 of 1] Compiling Main             ( lazy_test.hs, lazy_test.o )
Linking lazy_test ...

$ time ./lazy_test
[1]

real    0m0.019s
user    0m0.002s
sys     0m0.006s

nMax = 50,000,000:

$ stack ghc lazy_test.hs
[1 of 1] Compiling Main             ( lazy_test.hs, lazy_test.o )
Linking lazy_test ...

$ time ./lazy_test
[1]

real    0m0.020s
user    0m0.002s
sys     0m0.005s

根据我对惰性求值机制的理解，这是我所期望的。

但是，如果我从 State 切换到 StateT：

(lazy_test2.hs)

-- Testing lazy evaluation of monadically constructed lists, using StateT.
import Control.Monad.State

nMax = 5

foo :: Int -> StateT [Int] IO Bool
foo n = do
  modify $ \st -> n : st
  return (n `mod` 2 == 1)

main :: IO ()
main = do
  ress <- forM [0..nMax] $ \n -> runStateT (foo n) []
  let sts  = map snd $ dropWhile (not . fst) ress
  print $ head sts

for = flip map

然后我看到各自运行次之间的极端差异：

nMax = 5:

$ stack ghc lazy_test2.hs
[1 of 1] Compiling Main             ( lazy_test2.hs, lazy_test2.o )
Linking lazy_test2 ...

$ time ./lazy_test2 
[1]

real    0m0.019s
user    0m0.002s
sys     0m0.004s

nMax = 50,000,000:

$ stack ghc lazy_test2.hs
[1 of 1] Compiling Main             ( lazy_test2.hs, lazy_test2.o )
Linking lazy_test2 ...

$ time ./lazy_test2 
[1]

real    0m29.758s
user    0m25.488s
sys     0m4.231s

我假设这是因为当我切换到基于 StateT 的实现时，我失去了对单子构造列表的惰性评估。

对吗？
我可以恢复单子构造列表的惰性求值，同时保持基于 StateT 的实现吗？

Answer 1

在您的示例中，每个 runState 仅运行执行一个 foo 操作，因此您使用 State and/or StateT 本质上是无关紧要的。您可以将 foo 的使用替换为等效的：

import Control.Monad

nMax = 50000000

main :: IO ()
main = do
  ress <- forM [0..nMax] $ \n -> return (n `mod` 2 == 1, [n])
  let sts  = map snd $ dropWhile (not . fst) ress
  print $ head sts

它的行为方式相同。

问题在于 IO monad 的严格性。如果您运行在 Identity monad 中进行此计算：

import Control.Monad
import Data.Functor.Identity

nMax = 50000000

main :: IO ()
main = do
  let ress = runIdentity $ forM [0..nMax] $ \n -> return (n `mod` 2 == 1, [n])
  let sts  = map snd $ dropWhile (not . fst) ress
  print $ head sts

然后它会运行懒惰。

如果你想在 IO monad 中懒惰地运行，你需要用 unsafeInterleaveIO 明确地做到这一点，所以下面的方法会起作用：

import System.IO.Unsafe
import Control.Monad

nMax = 50000000

main :: IO ()
main = do
  ress <- lazyForM [0..nMax] $ \n -> return (n `mod` 2 == 1, [n])
  let sts  = map snd $ dropWhile (not . fst) ress
  print $ head sts

lazyForM :: [a] -> (a -> IO b) -> IO [b]
lazyForM (x:xs) f = do
  y <- f x
  ys <- unsafeInterleaveIO (lazyForM xs f)
  return (y:ys)
lazyForM [] _ = return []

Answer 2

K A Buhr 的另一个回答解释了为什么 State vs StateT 不是相关因素（IO 是），并指出你的例子是如何奇怪地构造的（在State(T) 部分实际上并未使用，因为每个数字都使用新状态 []）。但是除了这些点之外，我不确定我会说 "losing lazy evaluation of the monadically constructed list"，因为如果我们理解像 "lazy evaluation = evaluated only when needed" 这样的东西，那么 foo 确实需要在每个元素上运行在输入列表上以执行所有效果，因此惰性评估不是 "lost"。你得到了你想要的。（碰巧 foo 不执行任何 IO，也许其他人可以评论 compiler/GHC 是否有可能在此基础上优化它，但您可以很容易地理解为什么GHC 在这里做了天真的事情。）

这是 Haskell 中常见的众所周知的问题。有各种库（其中最著名的是 streaming、pipes、conduit）通过为您提供 streams（基本上是列表）来解决问题效果也很懒惰。如果我以流式传输方式重新创建您的示例，

import Data.Function ((&))
import Control.Monad.State
import Streaming
import qualified Streaming.Prelude as S

foo :: Int -> StateT [Int] IO Bool
foo n = 
  (n `mod` 2 == 1) <$ modify (n:)

nMax :: Int
nMax = 5000000

main :: IO ()
main = do
  mHead <- S.head_ $ S.each [0..nMax]
                   & S.mapM (flip runStateT [] . foo)
                   & S.dropWhile (not . fst)
  print $ snd <$> mHead

然后两个版本运行几乎是瞬间完成的。为了使差异更加明显，假设 foo 也称为 print "hi"。然后流媒体版本，在效果上是惰性的，只会打印两次，而你的原始版本会打印 nMax 次。因为他们在效果上很懒惰，所以不需要遍历整个列表来短路和提前完成。

从 State 切换到 StateT 后，如何恢复单子构造列表的延迟评估？

How do I recover lazy evaluation of a monadically constructed list, after switching from State to StateT?

haskell

lazy-evaluation

state-monad

monad-transformers

io-monad