Why doesn't print force the entire lazy IO value?

Question

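I am following the http-client tutorial to fetch a response body over a TLS connection. Since I can observe that print is called inside withResponse, why doesn't print force the entire response to be output in the following fragment?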

withResponse request manager $ \response -> do
    putStrLn $ "The status code was: " ++
               show (statusCode $ responseStatus response)
    body <- (responseBody response)
    print body

To get the whole body I need to write this instead:

response <- httpLbs request manager

putStrLn $ "The status code was: " ++
           show (statusCode $ responseStatus response)
print $ responseBody response

The body I want to print is a lazy ByteString. I am still not sure whether I should expect print to print the entire value.

instance Show ByteString where
    showsPrec p ps r = showsPrec p (unpackChars ps) r

Answer 1

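This has nothing to do with laziness. It comes down to the difference between the Response L.ByteString you get from httpLbs (or from the Simple module) and the Response BodyReader that withResponse passes to your callback.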

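You noticed that a BodyReader is an IO ByteString. More specifically, it is an action that can be repeated, and each run yields the next chunk of bytes. It follows the protocol that it never returns an empty bytestring except at end of file. (BodyReader might have been called ChunkGetter.) bip below is like what you wrote: after extracting the BodyReader/IO ByteString from the Response, it runs it once to get the first chunk and prints that. It does not repeat the action to get more, so in this case we only see the first few chapters of Genesis. What you need is a loop that exhausts the chunks, as in bop below, which spills the whole King James Bible into the console.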

{-# LANGUAGE OverloadedStrings #-} 
import Network.HTTP.Client
import Network.HTTP.Client.TLS
import qualified Data.ByteString.Char8 as B

main = bip
-- main = bop

bip = do 
  manager <- newManager tlsManagerSettings
  request <- parseRequest "https://raw.githubusercontent.com/michaelt/kjv/master/kjv.txt"
  withResponse request manager $ \response -> do
      putStrLn "The status code was: "  
      print (responseStatus response)
      -- Running the BodyReader once yields only the first chunk.
      chunk  <- responseBody response
      B.putStrLn chunk

bop = do 
  manager <- newManager tlsManagerSettings
  request <- parseRequest "https://raw.githubusercontent.com/michaelt/kjv/master/kjv.txt"
  withResponse request manager $ \response -> do
      putStrLn "The status code was: " 
      print (responseStatus response)
      -- Run the BodyReader repeatedly; an empty chunk signals end of input.
      let loop = do 
            chunk <- responseBody response
            if B.null chunk 
              then return () 
              else B.putStr chunk  >> loop 
      loop

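The loop keeps going back for more chunks until it receives an empty bytestring, which signals eof, so in the terminal it prints through to the end of the Apocalypse.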
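To see the protocol without touching the network, here is a minimal sketch that assumes nothing beyond base and bytestring. ChunkReader, mkReader and drain are hypothetical names invented for this illustration; ChunkReader merely stands in for BodyReader, which http-client defines as IO ByteString.

```haskell
import qualified Data.ByteString.Char8 as B
import Data.IORef (newIORef, atomicModifyIORef')

-- Hypothetical stand-in for http-client's BodyReader (an IO ByteString).
type ChunkReader = IO B.ByteString

-- Build a reader that hands out the given chunks one per call and then
-- returns the empty bytestring on every later call to signal end of input.
mkReader :: [B.ByteString] -> IO ChunkReader
mkReader chunks = do
  ref <- newIORef chunks
  return $ atomicModifyIORef' ref $ \remaining ->
    case remaining of
      []     -> ([], B.empty)
      (c:cs) -> (cs, c)

-- Run the reader until it yields an empty chunk, collecting the chunks.
drain :: ChunkReader -> IO [B.ByteString]
drain reader = do
  chunk <- reader
  if B.null chunk
    then return []
    else (chunk :) <$> drain reader

main :: IO ()
main = do
  let verse = map B.pack ["In the beginning ", "God created ", "the heaven and the earth."]
  r1 <- mkReader verse
  first <- r1                  -- like bip: a single run, a single chunk
  B.putStrLn first
  r2 <- mkReader verse
  whole <- drain r2            -- like bop: loop until the empty chunk
  B.putStrLn (B.concat whole)
```

Running main prints only the first chunk on one line and the whole sentence on the next. For a real BodyReader, http-client also exports brConsume, which collects the remaining chunks into a list in much the same way.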

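This behavior is straightforward but somewhat technical. You can only work with a BodyReader through hand-written recursion. But the purpose of the http-client library is to make things like http-conduit possible. There, the response you get from withResponse has the type Response (ConduitM i ByteString m ()). ConduitM i ByteString m () is how conduit types a byte stream; this byte stream would contain the whole file.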

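In the original form of the http-client/http-conduit material, the Response contained a conduit like this; the BodyReader part was later factored out into http-client so that it could be used by other streaming libraries such as pipes.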

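So, to take a simple example, in the corresponding http material for the streaming and streaming-bytestring libraries, withHTTP gives you a response of type Response (ByteString IO ()). As its name suggests, ByteString IO () is the type of byte streams arising in IO; ByteString Identity () would be the equivalent of a lazy bytestring (effectively a pure list of chunks). In this case the ByteString IO () represents the whole byte stream, all the way down to the Apocalypse. So with the imports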

 import qualified Data.ByteString.Streaming.HTTP as Bytes -- streaming-utils
 import qualified Data.ByteString.Streaming.Char8 as Bytes -- streaming-bytestring

the program is identical to the lazy bytestring program:

bap = do 
    manager <- newManager tlsManagerSettings
    request <- parseRequest "https://raw.githubusercontent.com/michaelt/kjv/master/kjv.txt"
    Bytes.withHTTP request manager $ \response -> do 
        putStrLn "The status code was: "
        print (responseStatus response)
        Bytes.putStrLn $ responseBody response

In fact it is slightly simpler, since you don't have to "extract the bytes from IO":

        lazy_bytes <- responseBody response
        Lazy.putStrLn lazy_bytes

but just write

        Bytes.putStrLn $ responseBody response

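and you "print" them directly. If you only want to see a bit from the middle of the KJV, you can do what you would do with a lazy bytestring and end with: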

        Bytes.putStrLn $ Bytes.take 1000 $ Bytes.drop 50000 $ responseBody response

Then you will see something about Abraham.
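For comparison, the same slice on an ordinary lazy ByteString might look like the sketch below. The helper name slice and the sample input are assumptions made for illustration; only L.take, L.drop, L.pack and L.putStrLn come from the bytestring package.

```haskell
import Data.Int (Int64)
import qualified Data.ByteString.Lazy.Char8 as L

-- Hypothetical helper: `len` bytes starting at byte offset `off`.
slice :: Int64 -> Int64 -> L.ByteString -> L.ByteString
slice off len = L.take len . L.drop off

main :: IO ()
main = do
  -- Sample input standing in for the downloaded text (an assumption).
  let text = L.pack (unwords (map show [1 .. 20000 :: Int]))
  -- Same shape as: Bytes.take 1000 $ Bytes.drop 50000 $ responseBody response
  L.putStrLn (slice 50000 1000 text)
```

The streaming version composes in exactly the same way; the difference is that its chunks are fetched from the connection as they are demanded rather than read from a value already in memory.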

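The withHTTP for streaming-bytestring just hides the recursive loop that we needed in order to use the BodyReader material from http-client directly. The same goes for the withHTTP in pipes-http, which represents a stream of bytestring chunks as a Producer ByteString IO (), and for http-conduit. In all of these cases, once you have your hands on the byte stream, you handle it in the ways typical of that streaming IO framework, without hand-written recursion. All of them use the BodyReader from http-client to do this, which was the main purpose of the library.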
