With FParsec, is it possible to manipulate the error position when a parser fails?
As an example, I'll take Phillip Trelford's simple C# parser. To parse an identifier, he wrote this (slightly modified):
let reserved = ["for"; "do"; "while"; "if"; "switch"; "case"; "default"; "break" (*;...*)]
let pidentifierraw =
    let isIdentifierFirstChar c = isLetter c || c = '_'
    let isIdentifierChar c = isLetter c || isDigit c || c = '_'
    many1Satisfy2L isIdentifierFirstChar isIdentifierChar "identifier"
let pidentifier =
    pidentifierraw
    >>= fun s ->
        if reserved |> List.exists ((=) s) then fail "keyword instead of identifier"
        else preturn s
The problem with pidentifier is that, when it fails, the position indicator is at the end of the stream. An example of mine:
Error in Ln: 156 Col: 41 (UTF16-Col: 34)
Block "main" 116x60 font=default fg=textForeground
^
Note: The column count assumes a tab stop distance of 8 chars.
keyword instead of identifier
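Obviously this isn't a C# snippet, but for the sake of the example I used pidentifier to parse the text after font=. Is it possible to tell FParsec to show the error at the beginning of the parsed input? Using >>?, .>>.? or any of the other backtracking variants seems to have no effect.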
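For anyone who wants to reproduce the behaviour, here is a minimal sketch of the setup as an F# script. It assumes the script is run with `dotnet fsi` and that the `#r "nuget: FParsec"` directive can resolve the package; `errorColumn` is a hypothetical helper added only for this illustration, and the type annotations are there to avoid value-restriction errors.

```fsharp
#r "nuget: FParsec"
open FParsec

let reserved = ["for"; "do"; "while"; "if"; "switch"; "case"; "default"; "break"]

let pidentifierraw : Parser<string, unit> =
    let isIdentifierFirstChar c = isLetter c || c = '_'
    let isIdentifierChar c = isLetter c || isDigit c || c = '_'
    many1Satisfy2L isIdentifierFirstChar isIdentifierChar "identifier"

// Same shape as in the question: the keyword check runs only after
// pidentifierraw has already consumed the whole word.
let pidentifier : Parser<string, unit> =
    pidentifierraw
    >>= fun s ->
        if List.contains s reserved then fail "keyword instead of identifier"
        else preturn s

// Hypothetical helper: the column FParsec reports for a failure, None on success.
let errorColumn (p: Parser<string, unit>) (input: string) : int64 option =
    match run p input with
    | Success _ -> None
    | Failure (_, err, _) -> Some err.Position.Column

printfn "%A" (errorColumn pidentifier "default")
printfn "%A" (errorColumn pidentifier "foo")
```

Because `fail` is invoked after the identifier has been consumed, the reported column for a keyword points just past the word rather than at its first character.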
I think what you want is attempt p, which backtracks to the original parser state if the parser p fails. So you could define pidentifier as:
let pidentifier =
    pidentifierraw
    >>= fun s ->
        if reserved |> List.exists ((=) s) then fail "keyword instead of identifier"
        else preturn s
    |> attempt // rollback on failure
The output looks like this:
Failure:
Error in Ln: 1 Col: 1
default
^
The parser backtracked after:
Error in Ln: 1 Col: 8
default
^
Note: The error occurred at the end of the input stream.
keyword instead of identifier
Update
If you don't want to see the backtracking information in the error message, you can use a simplified version of attempt, like this:
let attempt (parser : Parser<_, _>) : Parser<_, _> =
    fun stream ->
        let mutable state = CharStreamState(stream)
        let reply = parser stream
        if reply.Status <> Ok then
            stream.BacktrackTo(&state)
        reply
The output is now just:
Failure:
Error in Ln: 1 Col: 1
default
^
keyword instead of identifier
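To try the simplified combinator in isolation, the following script is a sketch under a few assumptions: it runs under `dotnet fsi` with the FParsec package resolvable via `#r "nuget: FParsec"`, the combinator is renamed to the hypothetical `attemptQuiet` so that it does not shadow FParsec's own `attempt`, and `rejectKeywords` and `describeFailure` are helper names invented for the illustration.

```fsharp
#r "nuget: FParsec"
open FParsec

// Hypothetical name; same body as the simplified attempt above.
// It restores the stream state on failure but leaves the error untouched,
// so no nested "backtracked after" error is created.
let attemptQuiet (parser: Parser<'a, 'u>) : Parser<'a, 'u> =
    fun stream ->
        let mutable state = CharStreamState(stream)
        let reply = parser stream
        if reply.Status <> ReplyStatus.Ok then
            stream.BacktrackTo(&state)
        reply

let reserved = ["for"; "do"; "while"; "if"; "switch"; "case"; "default"; "break"]

let pidentifierraw : Parser<string, unit> =
    let isIdentifierFirstChar c = isLetter c || c = '_'
    let isIdentifierChar c = isLetter c || isDigit c || c = '_'
    many1Satisfy2L isIdentifierFirstChar isIdentifierChar "identifier"

// Hypothetical intermediate name for the keyword check.
let rejectKeywords : Parser<string, unit> =
    pidentifierraw
    >>= fun s ->
        if List.contains s reserved then fail "keyword instead of identifier"
        else preturn s

let pidentifier : Parser<string, unit> = attemptQuiet rejectKeywords

// Hypothetical helper: (column, full message) for a failure, None on success.
let describeFailure (p: Parser<string, unit>) (input: string) =
    match run p input with
    | Success _ -> None
    | Failure (msg, err, _) -> Some (err.Position.Column, msg)

printfn "%A" (describeFailure pidentifier "default")
```

One design consequence worth noting: since the error is passed through unchanged, the information about where the inner parser actually stopped is lost from the message.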
Too bad I missed the >>=? operator, which apparently is (at least semantically) equivalent to that use of attempt.
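As a sketch of what the `>>=?` variant might look like (assuming a `dotnet fsi` script with FParsec resolvable via `#r "nuget: FParsec"`; `errorColumn` and `pfont` are hypothetical names for this illustration): `p >>=? f` backtracks to the start of `p` when the parser returned by `f` fails without consuming input, which is the case for `fail`.

```fsharp
#r "nuget: FParsec"
open FParsec

let reserved = ["for"; "do"; "while"; "if"; "switch"; "case"; "default"; "break"]

let pidentifierraw : Parser<string, unit> =
    let isIdentifierFirstChar c = isLetter c || c = '_'
    let isIdentifierChar c = isLetter c || isDigit c || c = '_'
    many1Satisfy2L isIdentifierFirstChar isIdentifierChar "identifier"

// >>=? instead of >>= : on a keyword, the stream is rewound to where
// the identifier started before the error is reported.
let pidentifier : Parser<string, unit> =
    pidentifierraw
    >>=? fun s ->
        if List.contains s reserved then fail "keyword instead of identifier"
        else preturn s

// Hypothetical parser mirroring the "font=" case from the question.
let pfont : Parser<string, unit> = pstring "font=" >>. pidentifier

// Hypothetical helper: the column FParsec reports for a failure, None on success.
let errorColumn (p: Parser<string, unit>) (input: string) : int64 option =
    match run p input with
    | Success _ -> None
    | Failure (_, err, _) -> Some err.Position.Column

printfn "%A" (errorColumn pidentifier "default")
printfn "%A" (errorColumn pfont "font=default")
```

Like `attempt`, this operator still records the inner error as a nested "The parser backtracked after:" entry, so it moves the reported position without shortening the message.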
The problem with both approaches is that, if the preceding parsers also backtrack, the subsequent "The parser backtracked after: […]" messages may cascade:
Error in Ln: 156 Col: 29 (UTF16-Col: 22)
Block "main" 116x60 font=default fg=textForeground
^
Note: The column count assumes a tab stop distance of 8 chars.
Expecting: space/tab
The parser backtracked after:
Error in Ln: 156 Col: 42 (UTF16-Col: 35)
Block "main" 116x60 font=default fg=textForeground
^
Note: The column count assumes a tab stop distance of 8 chars.
Expecting: space/tab
Other error messages:
keyword instead of identifier