了解潜在的冲突

Question

我有一个由 Menhir 构建的小型表达式解析器。我试图通过在 parser.mly:

中编写恢复语法来在解析期间恢复括号不完整的表达式

%{ 
   open AST
%}

%token<int> LINT 
%token<string> ID
%token LPAREN RPAREN COMMA
%token EOF PLUS STAR EQ

%start<AST.expression> expressionEOF

%right LPAREN RPAREN
%nonassoc EQ
%left PLUS
%left STAR

%%

expressionEOF: e=expression EOF
{
  e
}

expression:
| x=LINT
{
  Int x
}
| x=identifier
{
  Read x
}
| e1=expression b=binop e2=expression
{
  Binop (b, e1, e2)
}
| e1=expression b=binop
(* for "2+", "2*3+" *)
{
  Binop (b, e1, FakeExpression)
}
| LPAREN e=expression RPAREN
{
  Paren e
}
| LPAREN RPAREN
(* for "()" *)
{
  Paren FakeExpression
}
| LPAREN
(* for "(" *)
{
  ParenMissingRparen FakeExpression
}
| LPAREN e=expression 
(* for "(1", "(1+2", "(1+2*3", "((1+2)" *)
{
  ParenMissingRparen e
}
| RPAREN
(* for ")" *)
{
  ExtraRparen FakeExpression
}
| e=expression RPAREN 
(* for "3)", "4))", "2+3)" *)
{
  ExtraRparen e
}

%inline binop:
  PLUS { Add   }
| STAR { Mul   }
| EQ   { Equal }

identifier: x=ID
{
  Id x
}

它适用于一组不完整的表达式。但是，menhir --explain parser.mlyreturns以下parser.conflict：

** Conflict (reduce/reduce) in state 10.
** Tokens involved: STAR RPAREN PLUS EQ EOF
** The following explanations concentrate on token STAR.
** This state is reached from expressionEOF after reading:

LPAREN expression RPAREN

** The derivations that appear below have the following common factor:
** (The question mark symbol (?) represents the spot where the derivations begin to differ.)

expressionEOF 
expression EOF 
expression STAR expression // lookahead token appears
(?)

** In state 10, looking ahead at STAR, reducing production
** expression -> LPAREN expression RPAREN
** is permitted because of the following sub-derivation:

LPAREN expression RPAREN . 

** In state 10, looking ahead at STAR, reducing production
** expression -> expression RPAREN
** is permitted because of the following sub-derivation:

LPAREN expression // lookahead token is inherited
       expression RPAREN . 

** Conflict (reduce/reduce) in state 3.
** Tokens involved: STAR RPAREN PLUS EQ EOF
** The following explanations concentrate on token STAR.
** This state is reached from expressionEOF after reading:

LPAREN RPAREN

** The derivations that appear below have the following common factor:
** (The question mark symbol (?) represents the spot where the derivations begin to differ.)

expressionEOF 
expression EOF 
expression STAR expression // lookahead token appears
(?)

** In state 3, looking ahead at STAR, reducing production
** expression -> LPAREN RPAREN
** is permitted because of the following sub-derivation:

LPAREN RPAREN . 

** In state 3, looking ahead at STAR, reducing production
** expression -> RPAREN
** is permitted because of the following sub-derivation:

LPAREN expression // lookahead token is inherited
       RPAREN .

我不明白它试图解释什么。谁能告诉我什么可能是潜在的冲突（优先举例）以及解决方案是什么？

Answer 1

你有：

expr: '(' expr ')'
    | '(' expr
    |     expr ')'

因此，您希望 ( x ) 匹配第一条规则：

         expr
      -> '('  expr ')'  (rule 1)

它是做什么的。但它也匹配另一种方式：

         expr
      -> expr       ')' (rule 3)
      -> '(' expr   ')' (rule 2)

它也像这样匹配：

         expr
      -> '('   expr     (rule 2)
      -> '('   expr ')' (rule 3)

由于您还让 expr 匹配 ( 和 )，因此 ( ) 也可以通过多种方式匹配，包括 expr ')'（与 expr -> '('), 或者 '(' expr (with expr -> ')').

“解决方案”是放弃尝试添加对无效句子的识别。解析应该因语法错误而失败；一旦失败，您可以尝试使用 Menhir 的错误恢复机制来生成错误消息并继续解析。请参阅手册的第 11 节。

了解潜在的冲突

Understand potential conflicts

parsing

ocaml

menhir