How to write a parser that does not consume space?

Question

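I am writing a program that modifies source code files. I need to parse the file (e.g. with megaparsec), modify its abstract syntax tree (AST) (e.g. with Uniplate), and regenerate the file with as few changes as possible (e.g. preserving whitespace, comments, ...).

So the AST should contain the whitespace, for example: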

data Identifier = Identifier String String

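where the first string is the name of the identifier and the second is the whitespace that follows it. The same applies to every symbol in the language.

How can I write a parser for Identifier?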
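To illustrate why the trailing whitespace is stored next to the name, here is a minimal sketch using only base. The helpers renameWith and render are hypothetical names introduced for this example; they are not part of megaparsec or Uniplate.

```haskell
-- Identifier as defined above: the name, then the whitespace that follows it.
data Identifier = Identifier String String
  deriving (Show, Eq)

-- Hypothetical helper: change the name and leave the whitespace untouched.
renameWith :: (String -> String) -> Identifier -> Identifier
renameWith f (Identifier name ws) = Identifier (f name) ws

-- Hypothetical helper: rebuild the source text for one identifier.
render :: Identifier -> String
render (Identifier name ws) = name ++ ws
```

With this shape, rendering an unmodified node gives back exactly the text that was read, and a rename only changes the name part.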

Answer 1

I ended up writing parseLexeme to replace the lexeme from this tutorial:

data Lexeme a = Lexeme a String -- String contains the spaces after the lexeme

whites :: Parser String
whites = many spaceChar

parseLexeme :: Parser a -> Parser (Lexeme a)
parseLexeme p = do
  value <- p
  w <- whites
  return $ Lexeme value w

instance PPrint a => PPrint (Lexeme a) where
  pprint (Lexeme value w) = (pprint value) ++ w

The parser for identifiers then becomes:

data Identifier = Identifier (Lexeme String)

parseIdentifier :: Parser Identifier
parseIdentifier = do
  v <- parseLexeme $ (:) <$> letterChar <*> many (alphaNumChar <|> char '_')
  return $ Identifier v

instance PPrint Identifier where
  pprint (Identifier l) = pprint l
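The pieces above can be put together into a self-contained round-trip check. This is only a sketch under some assumptions: the Parser type synonym, the PPrint class, its String instance, and the roundTrip helper are not shown in the answer and are filled in here for illustration. It also assumes the input starts with an identifier; leading whitespace would have to be captured separately.

```haskell
{-# LANGUAGE FlexibleInstances #-}

import Data.Void (Void)
import Text.Megaparsec (Parsec, many, parseMaybe, (<|>))
import Text.Megaparsec.Char (alphaNumChar, char, letterChar, spaceChar)

-- Assumed parser type: no custom errors, String input.
type Parser = Parsec Void String

-- Assumed pretty-printing class (the answer uses PPrint without defining it).
class PPrint a where
  pprint :: a -> String

instance PPrint String where
  pprint = id

-- A value together with the whitespace that follows it.
data Lexeme a = Lexeme a String

whites :: Parser String
whites = many spaceChar

parseLexeme :: Parser a -> Parser (Lexeme a)
parseLexeme p = Lexeme <$> p <*> whites

instance PPrint a => PPrint (Lexeme a) where
  pprint (Lexeme value w) = pprint value ++ w

newtype Identifier = Identifier (Lexeme String)

parseIdentifier :: Parser Identifier
parseIdentifier =
  Identifier <$> parseLexeme ((:) <$> letterChar <*> many (alphaNumChar <|> char '_'))

instance PPrint Identifier where
  pprint (Identifier l) = pprint l

-- Hypothetical helper: parse a sequence of identifiers and print them back.
-- parseMaybe only succeeds if the whole input is consumed.
roundTrip :: String -> Maybe String
roundTrip src = concatMap pprint <$> parseMaybe (many parseIdentifier) src
```

If the whitespace is kept as intended, roundTrip on an input made only of identifiers and whitespace should return the input unchanged, and Nothing when the input contains something the parser does not accept.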

haskell

megaparsec