Parse Text input and get Text output (not String) with Parsec3

Question

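I see that Parsec3 handles Text (not just String) input, so I want to convert an old String parser to produce Text output. The other libraries I use also work with Text, so this would reduce the number of conversions needed.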

Now, the parsec3 library does seem to do what it says (handle both Text and String input). This example is from GHCi:

Text.Parsec.Text Text.Parsec Data.Text> parseTest (many1 $  char 's') (pack "sss")
"sss"
Text.Parsec.Text Text.Parsec Data.Text> parseTest (many1 $  char 's') "sss"
"sss"

So both Text (the first case) and String (the second case) work as input.
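To make the types in that session explicit, here is a minimal sketch (the names `sRun`, `fromText` and `fromString` are hypothetical, introduced only for illustration). It shows that the same parser accepts either stream type, but the result is a String in both cases, because `many1` collects the parsed `Char`s into a list regardless of the input type.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import Data.Text (pack)
import Text.Parsec (ParseError, ParsecT, Stream, char, many1, parse)

-- Hypothetical helper: one or more 's' characters, over any stream of Char.
-- The result type is String ([Char]) no matter what the stream type is.
sRun :: Stream s m Char => ParsecT s u m String
sRun = many1 (char 's')

-- Text input, String output.
fromText :: Either ParseError String
fromText = parse sRun "" (pack "sss")

-- String input, String output.
fromString :: Either ParseError String
fromString = parse sRun "" "sss"
```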

Now, in my real, converted parser (sorry, I had to stitch together some distant parts of the code here to make a complete example):

{-# LANGUAGE OverloadedStrings #-}
data UmeQueryPart = MidQuery Text Text MatchType

data MatchType = Strict | Fuzzy deriving Show

funcMT :: Text -> MatchType
funcMT mt = case mt of
        "~" -> Fuzzy
        _ -> Strict

midOfQuery :: Parser UmeQueryPart
midOfQuery = do
    spaces
    string "MidOf"
    spaces
    char '('
    spaces
    clabeltype <- many1 alphaNum
    spaces
    sep <- try( char ',') <|> char '~'
    spaces
    plabeltype <- many1 alphaNum
    spaces
    char ')'
    spaces
    return $ MidQuery (pack plabeltype) (pack clabeltype) (funcMT sep)

Regarding the funcMT call, I find myself getting lots of errors like this:

UmeQueryParser.hs:456:96:
    Couldn't match type ‘[Char]’ with ‘Text’
    Expected type: Text
      Actual type: String
    In the first argument of ‘funcMT’, namely ‘sep’
    In the fifth argument of ‘ midOfQuery’, namely ‘(funcMT sep)’

And if I do not explicitly pack the captured text as in the code example above, I get:

UmeQueryParser.hs:288:26:
    Couldn't match expected type ‘Text’ with actual type ‘[Char]’
    In the first argument of ‘ midOfQuery’, namely ‘(plabeltype)’
    In the second argument of ‘($)’, namely
      ‘StartQuery (plabeltype) (clabeltype) (funcMT sep)’

So it seems I need to explicitly convert the captured Strings to Text in the output.

So why do I need to go through the step of converting from String or Char to Text, when the whole point is to do Text -> Text parsing?

Answer 1

You can make your own Text parsers; something as simple as this will do:

midOfQuery :: Parser UmeQueryPart
midOfQuery = do
    spaces
    lexeme $ string "MidOf"
    lexeme $ char '('
    clabeltype <- lexeme alphaNums
    sep <- lexeme $ try (char ',') <|> char '~'
    plabeltype <- lexeme alphaNums
    lexeme $ char ')'
    return $ MidQuery plabeltype clabeltype (funcMT (pack [sep]))  -- sep is a Char; funcMT takes Text
  where
    alphaNums = pack <$> many1 alphaNum
    lexeme p = p <* spaces

Or, slightly more compact (but I think still more readable):

midOfQuery :: Parser UmeQueryPart
midOfQuery =
    spaces *> lexeme (string "MidOf") *>
    parens (toQuery <$> lexeme alphaNums <*> lexeme matchType <*> lexeme alphaNums)
  where
    lexeme :: Parser a -> Parser a
    lexeme p = p <* spaces

    alphaNums = pack <$> many1 alphaNum

    parens = between (lexeme $ char '(') (lexeme $ char ')')

    matchType = Fuzzy <$ char '~' <|>
                Strict <$ char ','

    toQuery cLabelType sep pLabelType = MidQuery pLabelType cLabelType sep
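For anyone who wants to try this in isolation, here is a self-contained sketch of the compact version with the imports and data declarations it appears to need. The `deriving (Eq, Show)` clauses and the `parseMidOf` wrapper are my additions for illustration and are not part of the original code; the rest follows the parser above.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text, pack)
import Text.Parsec
  (ParseError, alphaNum, between, char, many1, parse, spaces, string, (<|>))
import Text.Parsec.Text (Parser)

data MatchType = Strict | Fuzzy deriving (Eq, Show)

-- Eq and Show are derived here only so results can be compared and printed.
data UmeQueryPart = MidQuery Text Text MatchType deriving (Eq, Show)

midOfQuery :: Parser UmeQueryPart
midOfQuery =
    spaces *> lexeme (string "MidOf") *>
    parens (toQuery <$> lexeme alphaNums <*> lexeme matchType <*> lexeme alphaNums)
  where
    -- Run a parser, then skip any trailing whitespace.
    lexeme :: Parser a -> Parser a
    lexeme p = p <* spaces

    -- many1 yields a String; pack converts it to Text at the parser boundary.
    alphaNums = pack <$> many1 alphaNum

    parens = between (lexeme $ char '(') (lexeme $ char ')')

    -- Map the separator straight to a MatchType, so no Char/Text conversion is needed.
    matchType = Fuzzy <$ char '~' <|>
                Strict <$ char ','

    toQuery cLabelType sep pLabelType = MidQuery pLabelType cLabelType sep

-- Hypothetical convenience wrapper: run the parser on a Text value.
parseMidOf :: Text -> Either ParseError UmeQueryPart
parseMidOf = parse midOfQuery ""
```

With this loaded in GHCi, `parseMidOf "MidOf(foo ~ bar)"` should give `Right (MidQuery "bar" "foo" Fuzzy)`. The design point is that the String produced by `many1` is converted to Text once, inside `alphaNums`, so the rest of the parser only ever sees Text.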
