With FParsec, is it possible to manipulate the error position when a parser fails?

As an example, I'll take Phillip Trelford's simple C# parser. To parse an identifier, he writes this (slightly modified):

let reserved = ["for";"do"; "while";"if";"switch";"case";"default";"break" (*;...*)]
let pidentifierraw =
    let isIdentifierFirstChar c = isLetter c || c = '_'
    let isIdentifierChar c = isLetter c || isDigit c || c = '_'
    many1Satisfy2L isIdentifierFirstChar isIdentifierChar "identifier"
let pidentifier =
    pidentifierraw
    >>= fun s ->
        if reserved |> List.exists ((=) s) then fail "keyword instead of identifier"
        else preturn s

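The problem with pidentifier is that when it fails, the error position points to the end of the text it consumed rather than to the start. An example of mine: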

Error in Ln: 156 Col: 41 (UTF16-Col: 34)
        Block "main" 116x60 font=default fg=textForeground
                                        ^
Note: The column count assumes a tab stop distance of 8 chars.
keyword instead of identifier

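Obviously this isn't a C# snippet, but for the sake of the example I used pidentifier to parse the text after font=. Is it possible to tell FParsec to report the error at the beginning of the parsed input? Using >>?, .>>.? or any of the other backtracking variants doesn't seem to have any effect.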

I think what you want is attempt p, which backtracks to the original parser state if the parser p fails. So you could define pidentifier as:

let pidentifier =
    pidentifierraw
    >>= fun s ->
        if reserved |> List.exists ((=) s) then fail "keyword instead of identifier"
        else preturn s
    |> attempt   // rollback on failure

The output looks like this:

Failure:
Error in Ln: 1 Col: 1
default
^

The parser backtracked after:
  Error in Ln: 1 Col: 8
  default
         ^
  Note: The error occurred at the end of the input stream.
  keyword instead of identifier

Update

If you don't want to see the backtracking information in the error message, you can use a simplified version of attempt, like this:

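// Note: this definition shadows FParsec's built-in attempt.
let attempt (parser : Parser<_, _>) : Parser<_, _> =
    fun stream ->
        // Snapshot the stream state before running the parser
        // (mutable only so it can be passed by reference below).
        let mutable state = CharStreamState(stream)
        let reply = parser stream
        if reply.Status <> Ok then
            // Rewind the stream but leave the error untouched, so no
            // "The parser backtracked after" message is attached.
            stream.BacktrackTo(&state)
        reply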

The output is now just:

Failure:
Error in Ln: 1 Col: 1
default
^
keyword instead of identifier
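
To try this end to end, here is a minimal script sketch. It assumes FParsec is pulled from NuGet in an F# script and that the parser runs without user state (unit). The name attemptQuiet is hypothetical; it is used only so the built-in attempt is not shadowed, and the input "default" is the same sample as above.

```fsharp
#r "nuget: FParsec"
open FParsec

let reserved = ["for"; "do"; "while"; "if"; "switch"; "case"; "default"; "break"]

let pidentifierraw : Parser<string, unit> =
    let isIdentifierFirstChar c = isLetter c || c = '_'
    let isIdentifierChar c = isLetter c || isDigit c || c = '_'
    many1Satisfy2L isIdentifierFirstChar isIdentifierChar "identifier"

// Hypothetical helper name: the simplified attempt from above,
// renamed so that FParsec's own attempt stays available.
let attemptQuiet (parser: Parser<'a, 'u>) : Parser<'a, 'u> =
    fun stream ->
        let mutable state = CharStreamState(stream)
        let reply = parser stream
        if reply.Status <> ReplyStatus.Ok then
            stream.BacktrackTo(&state) // rewind, keep the error as is
        reply

let pidentifier : Parser<string, unit> =
    pidentifierraw
    >>= fun s ->
        if List.contains s reserved then fail "keyword instead of identifier"
        else preturn s
    |> attemptQuiet

// Print either the parsed identifier or FParsec's formatted error message.
match run pidentifier "default" with
| Success (name, _, _) -> printfn "identifier: %s" name
| Failure (message, _, _) -> printfn "%s" message
```

Running it on a keyword should print an error anchored at column 1, along the lines of the output shown above.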

Sadly, I missed the >>=? operator, which apparently is (at least semantically) equivalent to attempt that ...
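
For reference, a sketch of what the >>=? variant might look like, under the same assumptions as before (FParsec loaded from NuGet in a script, unit user state). It reuses the reserved list and pidentifierraw from the question. The only change is the operator, which backtracks to the start when the parser returned by the function fails without consuming input, as fail does.

```fsharp
#r "nuget: FParsec"
open FParsec

let reserved = ["for"; "do"; "while"; "if"; "switch"; "case"; "default"; "break"]

let pidentifierraw : Parser<string, unit> =
    let isIdentifierFirstChar c = isLetter c || c = '_'
    let isIdentifierChar c = isLetter c || isDigit c || c = '_'
    many1Satisfy2L isIdentifierFirstChar isIdentifierChar "identifier"

// Same keyword check as before, but with >>=? instead of >>= plus attempt.
let pidentifier : Parser<string, unit> =
    pidentifierraw
    >>=? fun s ->
        if List.contains s reserved then fail "keyword instead of identifier"
        else preturn s

// Report where the error ended up rather than the full message.
match run pidentifier "default" with
| Success (name, _, _) -> printfn "identifier: %s" name
| Failure (_, error, _) -> printfn "failed at column %d" error.Position.Column
```

Like attempt, this variant still attaches the "The parser backtracked after" note to the message; it only saves the explicit attempt call.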

The problem with both approaches is that, if the preceding parsers are also backtracking, the follow-up message "The parser backtracked after: […]" can cascade:

Error in Ln: 156 Col: 29 (UTF16-Col: 22)
        Block "main" 116x60 font=default fg=textForeground
                            ^
Note: The column count assumes a tab stop distance of 8 chars.
Expecting: space/tab

The parser backtracked after:
  Error in Ln: 156 Col: 42 (UTF16-Col: 35)
          Block "main" 116x60 font=default fg=textForeground
                                           ^
  Note: The column count assumes a tab stop distance of 8 chars.
  Expecting: space/tab
  Other error messages:
    keyword instead of identifier