Haskell 使用带有 Parsec 的累加器进行解析

Question

Haskell初学者。

说我有一个解析器，我提供了一些信息，它 returns 解析一个部分的结果和下一部分所需的信息。

readSection :: Info -> Parser (Section, Info)

我希望能够使用 many 组合器解析多个部分，但我无法让这些类型起作用。我需要某种方法让解析器接受先前计算的 Info 。这个结构让我想起了折叠（因为之前的结果只能存储在累加器中），但我不确定如何在这里应用它。

readSections :: Info -> Parser [Section]
readSections startInfo = do
  let parser = readSection startInfo
  -- ???
  return $ many readSection

也许有更好的方法来做这样的事情，所以任何指导将不胜感激。谢谢！

Answer 1

可以以明确的方式编写 many 的“扩展”版本，但此答案使用 StateT from "transformers" in order to reuse an already existing Alternative 实例。

函数

readSection :: Info -> Parser (Section, Info)

让人想起 StateT constructor:

StateT :: (Info -> Parser (Section, Info)) -> StateT Info Parser Section

其中 Info 是状态，Parser 是基础 monad。即：每次我们解析 Section 时，我们修改下一次解析尝试将使用的 Info。

此外，StateT还有以下实例：

(Functor m, MonadPlus m) => Alternative (StateT s m)

如果基础 monad 是 MonadPlus（like Parsec is) then StateT over the monad is an Alternative. This means that we can use the many from Alternative——但不是来自“parsec”的更受限制的 many :: ParsecT s u m a -> ParsecT s u m [a]。

所以我们要写：

import Control.Applicative
import Control.Monad.Trans.State.Strict

readSections :: Info -> Parser [Section]
readSections = evalStateT (Control.Applicative.many parser) 
  where
    parser :: StateT Info Parser Section
    parser = StateT readSection

使用 evalStateT，这使我们回到简单的 Parser 并丢弃最终的 Info 状态。

Answer 2

如果您正在寻找现有的组合器，这看起来像是 monad-loops（以及其他地方）中 unfoldrM 的应用。为了避免必须导入它，它可以定义为：

unfoldrM :: Monad m => (a -> m (Maybe (b, a))) -> a -> m [b]
unfoldrM f a = do
  r <- f a
  case r of
    Nothing -> return []
    Just (b, a') -> (b:) <$> unfoldrM f a'

基本上，它采用 a 类型的初始种子，并使用它来单子生成值和新种子。你的种子就是你的信息：你使用信息来单次生成一个解析的部分加上一个新版本的信息。

您只需插入 optionMaybe 以提供 Just/Nothing 开关，以便 unfoldrM 知道它何时到达所有部分的末尾：

readSections = unfoldrM (optionMaybe . readSection)

但是，作为 Haskell 初学者，了解您如何从头开始做这类事情可能会有所帮助。 many 组合器并不神奇。基本等价于比较简单的单子计算：

many p = (do x <- p
             xs <- many p
             return (x:xs))
         <|> return []

因此，您可以以类似的方式从头开始将 readSections 编写为一元计算：

readSections' :: Info -> Parser [Section]
readSections' i = (do -- try to read a section, getting updated info
                      (s, i') <- readSection i
                      -- read zero or more remaining sections with updated info
                      ss <- readSections' i'
                      -- return the sections
                      return (s:ss))
                  -- if no more sections to read, return an empty list
                  <|> return []

Haskell 使用带有 Parsec 的累加器进行解析

Haskell parsing using an accumulator with Parsec

haskell

parsec