Why is streaming-bytestring giving me error "openBinaryFile: resource exhausted (Too many open files)"?

Question

The streaming-bytestring library fails with an error after printing about 512 bytes.

Error:

openBinaryFile: resource exhausted (Too many open files)

Code:

import           Control.Monad.Trans (lift, MonadIO)
import           Control.Monad.Trans.Resource (runResourceT, MonadResource, MonadUnliftIO, ResourceT, liftResourceT)
import qualified Data.ByteString.Streaming          as BSS
import qualified Data.ByteString.Streaming.Char8    as BSSC
import           System.TimeIt

main :: IO ()
main = timeIt $ runResourceT $ dump $ BSS.drop 24 $ BSS.readFile "filename"

dump :: MonadIO m => BSS.ByteString m r -> m ()
dump bs = do
    isEmpty <- BSS.null_ bs
    if isEmpty then return ()
    else do
        BSSC.putStr $ BSS.take 1 bs
        dump $ BSS.drop 1 bs

Answer 1

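When using streaming libraries, it is usually a bad idea to reuse an effectful stream. That is, you can apply a function like drop or splitAt to a stream and then continue working with the resulting stream, or you can consume the stream as a whole with a function like fold, which leaves you in the base monad. But you should never apply the same stream value to two different functions.

Sadly, the Haskell type system as it stands cannot enforce that restriction at compile time; doing so would require some form of linear types. Instead, it becomes the responsibility of the user.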

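The null_ function is perhaps a wart in the streaming-bytestring API, because it doesn't return a new stream along with the result, which gives the impression that stream reuse is normal throughout the API. It would be better if it had a signature like null_ :: ByteString m r -> m (Bool, ByteString m r).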
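To make that signature concrete, here is a rough sketch of such a function built on nextChunk and consChunk. The name nullWithRest is hypothetical and not part of the library, and the sketch assumes the 0.1.x module layout (Data.ByteString.Streaming) used in the question.

```haskell
import qualified Data.ByteString           as B
import qualified Data.ByteString.Streaming as BSS

-- Hypothetical helper (not a library function): tests for emptiness and
-- hands back a stream that still contains everything that was inspected.
nullWithRest :: Monad m => BSS.ByteString m r -> m (Bool, BSS.ByteString m r)
nullWithRest bs = do
    step <- BSS.nextChunk bs            -- bs is consumed exactly once
    case step of
        -- End of stream: rebuild an empty stream carrying the result.
        Left r -> pure (True, pure r)
        Right (c, rest)
            -- Skip an empty chunk and keep looking.
            | B.null c  -> nullWithRest rest
            -- Put the chunk back in front of the remainder.
            | otherwise -> pure (False, BSS.consChunk c rest)

main :: IO ()
main = do
    (isEmpty, rest) <- nullWithRest (BSS.fromStrict (B.pack [104, 105]))
    print isEmpty
    BSS.stdout rest   -- continue with the returned stream, never the original
```

The caller continues with the returned stream, so the original value is never demanded a second time.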

Similarly, don't use drop and take with the same stream value. Instead, use splitAt or uncons and work with the pieces of the result.

-- liftIO must be in scope, e.g. import Control.Monad.IO.Class (liftIO)
dump :: MonadIO m => BSS.ByteString m r -> m ()
dump bs = do
    mc <- BSSC.uncons bs -- bs is only used once
    case mc of
        Left _ -> return ()
        Right (c,rest) -> do liftIO $ putChar c
                             dump rest

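So, about the error. As @BobDalgleish mentions in the comments, what is happening is that the file is opened when null_ is invoked (it is the first time we "demand" something from the stream). In the recursive call we pass the original bs value again, so it will open the file again, once per iteration, until we hit the file handle limit.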
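The mechanism can be reproduced without the library at all. Below is a toy model that uses only base: the Stream type, fakeReadFile and the other names are made up for illustration and are not streaming-bytestring's real representation, and an IORef counter stands in for opening a file handle.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- A toy effectful stream (an assumption for illustration only).
data Stream a
    = Done
    | Yield a (Stream a)
    | Effect (IO (Stream a))

-- Stand-in for readFile: the "open" lives inside the stream value,
-- so it runs every time the value is demanded from the start.
fakeReadFile :: IORef Int -> String -> Stream Char
fakeReadFile opens contents = Effect $ do
    modifyIORef' opens (+ 1)            -- pretend to open a handle
    pure (foldr Yield Done contents)

-- Like null_: demands the first step and throws the rest away.
isNull :: Stream a -> IO Bool
isNull Done         = pure True
isNull (Yield _ _)  = pure False
isNull (Effect act) = act >>= isNull

-- Like drop 1: builds a new stream without running anything yet.
dropOne :: Stream a -> Stream a
dropOne Done         = Done
dropOne (Yield _ s)  = s
dropOne (Effect act) = Effect (dropOne <$> act)

-- Like uncons: returns the head together with the remaining stream.
next :: Stream a -> IO (Maybe (a, Stream a))
next Done         = pure Nothing
next (Yield a s)  = pure (Just (a, s))
next (Effect act) = act >>= next

-- Mirrors the question's loop: s is used for the test and again in the recursion.
dumpReusing :: Stream Char -> IO ()
dumpReusing s = do
    isEmpty <- isNull s
    if isEmpty then pure () else dumpReusing (dropOne s)

-- Mirrors the uncons version: every stream value is consumed once.
dumpOnce :: Stream Char -> IO ()
dumpOnce s = next s >>= maybe (pure ()) (dumpOnce . snd)

-- Runs a consumer and reports how many times the fake file was opened.
countOpens :: (Stream Char -> IO ()) -> String -> IO Int
countOpens consume contents = do
    opens <- newIORef 0
    consume (fakeReadFile opens contents)
    readIORef opens

main :: IO ()
main = do
    reused <- countOpens dumpReusing "abc"
    once   <- countOpens dumpOnce "abc"
    putStrLn ("opens when reusing the stream: " ++ show reused)
    putStrLn ("opens when consuming once:     " ++ show once)
```

In this model the reusing loop performs one open per iteration (four for the three-character input, counting the final emptiness check), while the uncons-style loop opens once. The model never closes anything, so it only illustrates why the open count grows, not the exact handle lifetime under ResourceT.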

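Personally, I don't like using ResourceT with streaming libraries. I prefer opening the file with withFile and then creating and consuming the stream within the callback, when possible. But some things are harder that way.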
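A minimal sketch of the withFile approach, assuming the same 0.1.x module names as the question; dumpFrom is a hypothetical name, and the whole stream is written with BSS.stdout instead of byte by byte.

```haskell
import qualified Data.ByteString.Streaming as BSS
import           System.IO                 (IOMode (ReadMode), withFile)

-- Hypothetical helper: the handle is opened once and closed when the
-- callback returns, so the stream must be fully consumed inside it.
dumpFrom :: FilePath -> IO ()
dumpFrom path =
    withFile path ReadMode $ \h ->
        -- Build the stream from the handle, skip the first 24 bytes
        -- (as in the question), and write the rest to standard output.
        BSS.stdout $ BSS.drop 24 $ BSS.fromHandle h

main :: IO ()
main = dumpFrom "filename"
```

The trade-off is that the stream cannot escape the callback: anything that needs the data after withFile returns has to be computed inside it.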


haskell

haskell-streaming