Difference between Data.ByteString and Data.ByteString.Char8

I understand that Char8 only supports ASCII characters, and that it is dangerous to use if you are working with other Unicode characters.

{-# LANGUAGE OverloadedStrings #-}

--import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import qualified Data.Text.IO as TIO
import qualified Data.Text.Encoding as E
import qualified Data.Text as T

name :: T.Text
name = "{ \"name\": \"哈时刻\" }"

nameB :: BC.ByteString
nameB = E.encodeUtf8 name

main :: IO ()
main = do
  BC.writeFile "test.json" nameB
  putStrLn "done"

produces the same result as:
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString as B
--import qualified Data.ByteString.Char8 as BC
import qualified Data.Text.IO as TIO
import qualified Data.Text.Encoding as E
import qualified Data.Text as T

name :: T.Text
name = "{ \"name\": \"哈时刻\" }"

nameB :: B.ByteString
nameB = E.encodeUtf8 name

main :: IO ()
main = do
  B.writeFile "test.json" nameB
  putStrLn "done"

So what is the difference between using Data.ByteString.Char8 and Data.ByteString?

If you compare Data.ByteString and Data.ByteString.Char8, you will notice that a bunch of functions that refer to Word8 in the former refer to Char in the latter.

-- Data.ByteString
map :: (Word8 -> Word8) -> ByteString -> ByteString
cons :: Word8 -> ByteString -> ByteString
snoc :: ByteString -> Word8 -> ByteString
head :: ByteString -> Word8
uncons :: ByteString -> Maybe (Word8, ByteString) 
{- and so on... -}


-- Data.ByteString.Char8
map :: (Char -> Char) -> ByteString -> ByteString
cons :: Char -> ByteString -> ByteString
snoc :: ByteString -> Char -> ByteString
head :: ByteString -> Char
uncons :: ByteString -> Maybe (Char, ByteString) 
{- and so on... -}

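For these functions, and only for these functions, Data.ByteString.Char8 saves you from constantly converting Word8 values to and from Char values. Both modules operate on the same ByteString type, and writeFile does exactly the same thing in both.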
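To make that convenience concrete, here is a minimal sketch that upper-cases an ASCII ByteString twice: once with the Char-based map and once with the Word8-based map plus manual conversions. The helper names (byteToChar, charToByte, upperChar8, upperWord8) are made up for this illustration and are not part of the bytestring package.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import Data.Char (chr, ord, toUpper)
import Data.Word (Word8)

-- Hypothetical helper: view a byte as the Char with the same code (0..255).
byteToChar :: Word8 -> Char
byteToChar = chr . fromIntegral

-- Hypothetical helper: keep only the low 8 bits of a Char's code point.
charToByte :: Char -> Word8
charToByte = fromIntegral . ord

-- Upper-case using the Char-based API from Data.ByteString.Char8.
upperChar8 :: B.ByteString -> B.ByteString
upperChar8 = BC.map toUpper

-- The same operation through the Word8-based API, converting by hand.
upperWord8 :: B.ByteString -> B.ByteString
upperWord8 = B.map (charToByte . toUpper . byteToChar)

main :: IO ()
main = do
  let input = BC.pack "hello, world"
  BC.putStrLn (upperChar8 input)
  BC.putStrLn (upperWord8 input)
  print (upperChar8 input == upperWord8 input)
```

Note that both functions take and return the same ByteString type; only the element type seen by the mapping function differs.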

Here is a good way to see how similar functions behave differently in Text, ByteString, and ByteString.Char8:

{-# LANGUAGE OverloadedStrings #-}

import Data.Text.Encoding

import qualified Data.Text as T
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC

nameText :: T.Text
nameText = "哈时刻"

nameByteString :: B.ByteString
nameByteString = encodeUtf8 nameText

main :: IO ()
main = do
  print $ T.head nameText               -- '\21704'     actual first character
  print $ B.head nameByteString         -- 229          first byte
  print $ BC.head nameByteString        -- '\229'       first byte as character

  putStrLn [ T.head nameText ]          -- 哈           actual first character
  putStrLn [ BC.head nameByteString ]   -- å            first byte as character
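As for the danger mentioned in the question, it shows up when you go the other way and build a ByteString from Chars with the Char8 functions. The sketch below, which assumes the same sample string as above, compares BC.pack (which keeps only the low 8 bits of each Char) with encodeUtf8. The names truncatedBytes and utf8Bytes are invented for this example.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8, encodeUtf8)

-- Hypothetical helper: pack via Char8, truncating every Char to one byte.
truncatedBytes :: String -> B.ByteString
truncatedBytes = BC.pack

-- Hypothetical helper: encode properly as UTF-8 by going through Text.
utf8Bytes :: String -> B.ByteString
utf8Bytes = encodeUtf8 . T.pack

main :: IO ()
main = do
  let s = "哈时刻"
  print (B.length (truncatedBytes s))          -- 3: one byte per Char, data lost
  print (B.length (utf8Bytes s))               -- 9: three UTF-8 bytes per Char
  print (decodeUtf8 (utf8Bytes s) == T.pack s) -- True: UTF-8 round-trips
```

So the Char8 module is fine for reading and writing bytes you already encoded, but packing non-Latin-1 text with it silently discards information.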