How to correctly parse field access after function call via parser-combinators (makeExprParser) library?

Question

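I want to parse expressions like this: a().x. The result should look like EAttrRef (EFuncCall (EVarRef "a") []) "x". Unfortunately, my expression parser stops too early: it parses only a() and then stops.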

1:4:
  |
1 | a().x
  |    ^
unexpected '.'
expecting end of input

Code:

pExpr :: Parser Expr
pExpr = lexeme p & dbg "pExpr" <?> "expression"
  where
    pTerm = try pVarRef <|> pELit
    p = makeExprParser pTerm exprTable
    exprTable = [[Postfix opIndexRef], [InfixL opAttrRef], [Postfix opFuncCall]]
    opAttrRef :: Parser (Expr -> Expr -> Expr)
    opAttrRef = do
      symbol "." & dbg "opAttrRef symbol \".\""
      return r
      where
        r x (EVarRef y) = EAttrRef x y
        r x y = error [qq|opAttrRef got unexpected right operand $y (left operand was $x)|]
    opFuncCall :: Parser (Expr -> Expr)
    opFuncCall = do
      symbol "("
      args <- sepBy pExpr (symbol ",")
      symbol ")" & dbg "opFuncCall symbol \")\""
      return $ \funcExpr -> EFuncCall funcExpr args
    opIndexRef = do
      symbol "["
      e <- pExpr
      symbol "]" & dbg "opIndexRef symbol \"]\""
      return $ \obj -> EIndexRef obj e

Debug output:

opAttrRef symbol "."> IN: "().x"
opAttrRef symbol "."> MATCH (EERR): <EMPTY>
opAttrRef symbol "."> ERROR:
opAttrRef symbol "."> offset=1:
opAttrRef symbol "."> unexpected '('
opAttrRef symbol "."> expecting '.'

pExpr> IN: ").x"
pExpr> MATCH (EERR): <EMPTY>
pExpr> ERROR:
pExpr> offset=2:
pExpr> unexpected ").x"
pExpr> expecting "false", "null", "true", '"', '+', '-', '[', digit, identifier, or integer

opFuncCall symbol ")"> IN: ").x"
opFuncCall symbol ")"> MATCH (COK): ')'
opFuncCall symbol ")"> VALUE: ")"

pExpr> IN: "a().x"
pExpr> MATCH (COK): "a()"
pExpr> VALUE: EFuncCall (EVarRef "a") []

It seems to me that makeExprParser does not call opFuncCall a second time (judging by how the index access debug output looks), but I have no idea why.

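It does parse when I lower the precedence of opAttrRef, but then it produces the wrong tree. For example, in x.a() the right operand would be a(), which is incorrect: it should be a, with the whole attribute reference then wrapped in the function call. So I can't use that approach. (I'm fairly sure the current precedence is correct, since it is based on the language's reference.)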

Answer 1

Your current expression parser corresponds roughly to the following BNF:

expr = funcOp ;
funcOp = attrOp , { "(" , expr, ")" } ;
attrOp = attrOp , "." , indexOp | indexOp ;
indexOp = term , { "[", expr, "]" } ;

Once it has finished parsing a funcCall, it does not go back through the operator table to parse any further attrRef or indexRef.

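The problem with lowering the precedence of opAttrRef is that the left-hand and right-hand sides of the dot are then parsed separately, whereas you seem to want the parser to read from left to right and to be able to mix funcCall, attrRef, and indexRef freely. So if you want to be able to parse something like a[b](c).d(e)[f], I suggest changing opAttrRef from infix to postfix and flattening the operator table to: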

exprTable = [[Postfix opIndexRef, Postfix opAttrRef, Postfix opFuncCall]]

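As a postfix operator, opAttrRef must have type Parser (Expr -> Expr), so it has to parse the attribute name after the dot itself. The parser then becomes: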

expr = term , { indexRef | attrRef | funcCall } ;

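Note that makeExprParser applies at most one postfix operator per term at a given precedence level. If you need to allow several postfix operators in a row, you can rewrite the expression parser like this: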

p = liftM2 (foldl (flip ($))) pTerm (many (opAttrRef <|> opIndexRef <|> opFuncCall))
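To make the one-liner above concrete, here is a minimal, self-contained sketch of the same idea: parse one term, then fold any number of postfix operators over it from left to right. It assumes a cut-down Expr type with only the constructors named in the question, a hypothetical pIdent helper, and terms that are plain identifiers (no literals), so it is an illustration rather than a drop-in replacement for the original pExpr.

```haskell
module Main (main) where

import Data.Bifunctor (first)
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char (alphaNumChar, letterChar, space)
import qualified Text.Megaparsec.Char.Lexer as L

type Parser = Parsec Void String

-- Assumption: a reduced AST using only the constructors from the question.
data Expr
  = EVarRef String
  | EAttrRef Expr String
  | EFuncCall Expr [Expr]
  | EIndexRef Expr Expr
  deriving (Eq, Show)

-- Parse a fixed string and skip trailing whitespace.
symbol :: String -> Parser String
symbol = L.symbol space

-- Hypothetical identifier parser: a letter followed by letters or digits.
pIdent :: Parser String
pIdent = L.lexeme space ((:) <$> letterChar <*> many alphaNumChar) <?> "identifier"

-- Postfix form of attribute access: it consumes the name after the dot itself.
opAttrRef :: Parser (Expr -> Expr)
opAttrRef = do
  _ <- symbol "."
  name <- pIdent
  pure (\obj -> EAttrRef obj name)

opFuncCall :: Parser (Expr -> Expr)
opFuncCall = do
  args <- between (symbol "(") (symbol ")") (pExpr `sepBy` symbol ",")
  pure (\f -> EFuncCall f args)

opIndexRef :: Parser (Expr -> Expr)
opIndexRef = do
  ix <- between (symbol "[") (symbol "]") pExpr
  pure (\obj -> EIndexRef obj ix)

-- One term followed by zero or more postfix operators, applied left to right.
pExpr :: Parser Expr
pExpr = do
  t <- EVarRef <$> pIdent
  ops <- many (opAttrRef <|> opIndexRef <|> opFuncCall)
  pure (foldl (flip ($)) t ops)

-- Parse a whole input string, rendering any error as text.
parseExpr :: String -> Either String Expr
parseExpr = first errorBundlePretty . parse (space *> pExpr <* eof) "<input>"

main :: IO ()
main = do
  print (parseExpr "a().x")
  print (parseExpr "a[b](c).d(e)[f]")
```

With this shape, a().x yields an EAttrRef wrapped around the EFuncCall, and x.a() yields an EFuncCall wrapped around the EAttrRef, which are the trees the question asks for.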

If you want to add arithmetic, logical, and other common operators, the p parser can be used as the term parser for makeExprParser.
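As a rough illustration of that layering, the sketch below uses a postfix-chain parser as the term parser of makeExprParser and adds four left-associative binary operators on top. The EBinOp constructor, the pIdent and pPostfix helpers, and the choice of operators are assumptions made for this example; they do not come from the original code.

```haskell
module Main (main) where

import Control.Monad.Combinators.Expr (Operator (InfixL), makeExprParser)
import Data.Bifunctor (first)
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char (alphaNumChar, letterChar, space)
import qualified Text.Megaparsec.Char.Lexer as L

type Parser = Parsec Void String

-- Assumption: EBinOp is a hypothetical constructor added for this sketch.
data Expr
  = EVarRef String
  | EAttrRef Expr String
  | EFuncCall Expr [Expr]
  | EBinOp String Expr Expr
  deriving (Eq, Show)

symbol :: String -> Parser String
symbol = L.symbol space

-- Hypothetical identifier parser: a letter followed by letters or digits.
pIdent :: Parser String
pIdent = L.lexeme space ((:) <$> letterChar <*> many alphaNumChar) <?> "identifier"

opAttrRef :: Parser (Expr -> Expr)
opAttrRef = do
  _ <- symbol "."
  name <- pIdent
  pure (\obj -> EAttrRef obj name)

opFuncCall :: Parser (Expr -> Expr)
opFuncCall = do
  args <- between (symbol "(") (symbol ")") (pExpr `sepBy` symbol ",")
  pure (\f -> EFuncCall f args)

-- The postfix chain binds tighter than any operator in the table below.
pPostfix :: Parser Expr
pPostfix = do
  t <- EVarRef <$> pIdent
  ops <- many (opAttrRef <|> opFuncCall)
  pure (foldl (flip ($)) t ops)

-- Binary operators are layered on top; earlier rows bind more tightly.
pExpr :: Parser Expr
pExpr = makeExprParser pPostfix table
  where
    table =
      [ [binary "*", binary "/"]
      , [binary "+", binary "-"]
      ]
    binary name = InfixL (EBinOp name <$ symbol name)

parseExpr :: String -> Either String Expr
parseExpr = first errorBundlePretty . parse (space *> pExpr <* eof) "<input>"

main :: IO ()
main = print (parseExpr "a().x + b.c * d")
```

Keeping the postfix chain inside the term parser means field access, calls, and indexing never compete with the binary operators for precedence, so the operator table only has to describe the infix levels.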

haskell

parser-combinators

megaparsec