如何在 FParsec 中解析递归文法

How to parse recusrive grammar in FParsec

我无法使用以前的问题来让它工作

Recursive grammars in FParsec

我想为我最近发现自己正在编写的一种语言构建一种 Lexer 和项目系统。该语言称为 q。它是一种相当简单的语言,没有运算符优先级。例如 1*2+3(1*(2+3)) 相同。它的工作原理有点像反向波兰符号计算器,计算从右到左。

我在用 FParsec 表达这个问题时遇到了问题。我整理了以下简化的演示

open FParsec

type BinaryOperator = BinaryOperator of string
type Number = Number of string

type Element =
  |Number of Number
and Expression = 
  |Element of Element
  |BinaryExpression of Element * BinaryOperator * Expression

let number = regex "\d+\.?\d*" |>> Number.Number

let element = [ number ] |> choice |>> Element.Number

let binaryOperator = ["+"; "-"; "*"; "%"] |> Seq.map pstring |> choice |>> BinaryOperator

let binaryExpression expression = pipe3 element binaryOperator expression (fun l o r -> (l,o,r))
let expression =

  let exprDummy, expRef = createParserForwardedToRef()
  let elemExpr = element |>> Element
  let binExpr = binaryExpression exprDummy |>> BinaryExpression
  expRef.Value <- [binExpr; elemExpr; ] |> choice
  expRef

let statement = expression.Value .>> eof

let parseString s =
  printfn "Parsing input: '%s'" s

  match run statement s with
  | Success(result, _, _)   -> printfn "Ok: %A" result
  | Failure(errorMsg, _, _) -> printfn "Error: %A" errorMsg

//tests
parseString "1.23"
parseString "1+1"
parseString "1*2+3" // equivalent to (1*(2+3))

到目前为止,我还没有想出一种方法来满足所有 3 个测试用例。在上面,它首先尝试解析 binExpr,意识到它不能,但随后必须使用输入,因为它接下来不会尝试评估 elemExpr。不知道该怎么办。如何满足这 3 项测试?

思考 Tomas' answer,我想出了以下可行的方法

let expr, expRef = createParserForwardedToRef()
let binRightExpr = binaryOperator .>>. expr
expRef.Value <- parse{
  let! first = element
  return! choice [
      binRightExpr |>> (fun (o, r) -> (first, o, r) |> BinaryExpression)
      preturn (first |> Element)
    ]
}

let statement = expRef.Value .>> eof

第一个解析器失败的原因在 FParsec docs

中给出

The behaviour of the <|> combinator has two important characteristics:

<|> only tries the parser on the right side if the parser on the left side fails. It does not implement a longest match rule. However, it only tries the right parser if the left parser fails without consuming input.

可能需要清理一些东西,比如 AST 的结构,但我想我可以继续了。