Replace newlines in ByteString
I want a function that takes a ByteString and replaces the newlines \n and \n\r with commas, but I can't think of a good way to do it.
import qualified Data.ByteString as BS
import Data.Char (ord)
import Data.Word (Word8)
endlWord8 = fromIntegral $ ord '\n' :: Word8
replace :: BS.ByteString -> BS.ByteString
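I thought about using BS.map, but I can't see how to use it, because I can't pattern match on Word8. Another option would be BS.split followed by joining the pieces with a Word8 comma, but that sounds slow and inelegant. Any ideas?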
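Use Data.ByteString.Char8 to get rid of the annoying Word8 <-> Char conversions you would otherwise have to write. According to the first sentence of the Data.ByteString.Char8 documentation, this should not change performance.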
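To make the difference concrete, here is a minimal sketch of the same single-byte replacement written against both modules. The names commaW8 and commaC8 are made up for illustration. Note that Word8 can in fact be matched against numeric literals, and that neither version handles the two-byte \n\r case, since map can only replace one byte with one byte.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as B
import Data.Word (Word8)

-- Word8 version: bytes are matched against numeric literals
-- (10 is '\n', 44 is ','), which is legal but hard to read.
commaW8 :: BS.ByteString -> BS.ByteString
commaW8 = BS.map swap
  where
    swap :: Word8 -> Word8
    swap 10 = 44
    swap w  = w

-- Char8 version: same underlying ByteString type, but the
-- mapping function works on Char, so character literals can be used.
commaC8 :: B.ByteString -> B.ByteString
commaC8 = B.map swap
  where
    swap :: Char -> Char
    swap '\n' = ','
    swap c    = c
```

Both functions turn "a\nb" into "a,b", but "a\n\rb" becomes "a,\rb" with either of them, which is why a map alone is not enough here.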
Also, use B.span rather than B.split, because you want to replace the \n\r combination as well, not just \n.
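A small sketch of why B.span is the more convenient primitive here: it leaves the newline at the front of the remainder, so the byte after it can still be inspected, whereas B.split consumes the \n and leaves a stray \r at the start of the next piece. The helper name untilNewline is hypothetical.

```haskell
import qualified Data.ByteString.Char8 as B

-- Split the input into the part before the first '\n' and the rest.
-- The rest, if non-empty, still begins with the '\n' itself.
untilNewline :: B.ByteString -> (B.ByteString, B.ByteString)
untilNewline = B.span (/= '\n')
```

For the input "ab\n\rcd", untilNewline yields ("ab", "\n\rcd"), while B.split '\n' on the same input yields ["ab", "\rcd"].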
My own (possibly clumsy) attempt at this:
module Test where

import Data.Monoid ((<>))
import Data.ByteString.Char8 (ByteString)
import qualified Data.ByteString.Char8 as B
import qualified Data.ByteString.Builder as Build
import qualified Data.ByteString.Lazy as LB

-- Consume one newline ("\n" or "\n\r") from the front of the input, if present,
-- and return the comma that should replace it together with the remaining input.
eatNewline :: ByteString -> (Maybe Char, ByteString)
eatNewline string
  | B.null string = (Nothing, string)
  | B.head string == '\n' && B.null (B.tail string) = (Just ',', B.empty)
  | B.head string == '\n' && B.head (B.tail string) /= '\r' = (Just ',', B.drop 1 string)
  | B.head string == '\n' && B.head (B.tail string) == '\r' = (Just ',', B.drop 2 string)
  | otherwise = (Nothing, string)

replaceNewlines :: ByteString -> ByteString
replaceNewlines = LB.toStrict . Build.toLazyByteString . go mempty
  where
    -- Accumulate the output in a Builder, one chunk and one comma at a time.
    go :: Build.Builder -> ByteString -> Build.Builder
    go builder string =
      let (chunk, rest) = B.span (/= '\n') string
          (c, rest1) = eatNewline rest
          maybeComma = maybe mempty Build.char8 c
      in if B.null rest1
           then builder <> Build.byteString chunk <> maybeComma
           else go (builder <> Build.byteString chunk <> maybeComma) rest1
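Hopefully the cost of Data.ByteString.Builder's mappend is not linear in the number of times mappend has already been applied to one of its operands; otherwise this would be a quadratic algorithm.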
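On that worry: as far as I can tell, the bytestring documentation describes Builder concatenation as O(1), so accumulating chunks this way should not turn quadratic. For comparison, here is a more compact sketch of the same idea that uses B.uncons to drop an optional \r after each newline. The names replaceNewlines' and dropCR are hypothetical, and the sketch assumes the same semantics as the code above (\n and \n\r each become a single comma; a lone \r is left untouched).

```haskell
import Data.Monoid ((<>))
import Data.ByteString.Char8 (ByteString)
import qualified Data.ByteString.Char8 as B
import qualified Data.ByteString.Builder as Build
import qualified Data.ByteString.Lazy as LB

-- Hypothetical compact variant: emit each chunk up to the next '\n',
-- then a comma, then continue after the newline (and an optional '\r').
replaceNewlines' :: ByteString -> ByteString
replaceNewlines' = LB.toStrict . Build.toLazyByteString . go
  where
    go :: ByteString -> Build.Builder
    go s = case B.span (/= '\n') s of
      (chunk, rest)
        | B.null rest -> Build.byteString chunk
        | otherwise   ->
            Build.byteString chunk
              <> Build.char8 ','
              <> go (dropCR (B.drop 1 rest))

    -- Drop a single leading '\r', if there is one.
    dropCR :: ByteString -> ByteString
    dropCR t = case B.uncons t of
      Just ('\r', t') -> t'
      _               -> t
```

With this definition, replaceNewlines' (B.pack "a\nb\n\rc\n") evaluates to "a,b,c,".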