Ambiguous context-free grammar? / Shift/Reduce conflict in CUP

I have the following context-free grammar for a simplified version of C++. When I run it through JFlex and CUP, I get a list of errors like this one:

Warning : *** Reduce/Reduce conflict found in state #173
  between especificador ::= (*) 
  and     programa ::= (*) 
  under symbols: {VOID, CHAR, FLOAT, DOUBLE, SIGNED, UNSIGNED, INT, SHORT, LONG}
  Resolved in favor of the second production.

I think the problem is in instrucoesIf, but I can't figure it out.

start with programa;

programa ::= especificador tipo ID programa2 | DEFINE ID num CRLF programa | ; // check the ERROR later
especificador ::= AUTO | STATIC | EXTERN | CONST | ;
tipo ::= VOID | CHAR | FLOAT | DOUBLE | SIGNED inteiro | UNSIGNED inteiro | inteiro;

inteiro ::= SHORT | INT | LONG;
programa2 ::= SEMICOLON programa | LBRACK num RBRACK SEMICOLON programa | LPAREN listaParametros RPAREN bloco programa | COMMA listaID programa;

listaID ::= ID declaracaoParam2 listaIDTail;
listaIDTail ::= SEMICOLON | COMMA listaID;
listaParametros ::= listaParamRestante | ;
listaParamRestante ::= declaracaoParam declParamRestante;
declaracaoParam ::= tipo ID declaracaoParam2;
declaracaoParam2 ::= LBRACK num RBRACK | ;
declParamRestante ::= COMMA listaParamRestante | ;
bloco ::= LBRACE conjuntoInst RBRACE | SEMICOLON conjuntoInst ;
conjuntoInst ::= programa conjuntoInst | instrucoes conjuntoInst | ;
instrucoes ::= ID expressao SEMICOLON | RETURN expr SEMICOLON | PRINTF LPAREN expr RPAREN SEMICOLON | SCANF LPAREN ID RPAREN SEMICOLON | BREAK SEMICOLON | IF LPAREN expr RPAREN instrucoes instrucoesIf;

instrucoesIf ::= ELSE instrucoes | ;
expressao ::= atribuicao | LBRACK expr RBRACK atribuicao | LPAREN exprList RPAREN | ;
atribuicao ::= operadorAtrib expr;
operadorAtrib ::= EQ | MULTEQ | DIVEQ | MODEQ | PLUSEQ | MINUSEQ;
expr ::= exprAnd exprOr;
exprList ::= expr exprListTail | ;
exprListTail ::= COMMA exprList | ;
exprOr ::= OR exprAnd exprOr | ;
exprAnd ::= exprEqual exprAnd2;
exprAnd2 ::= AND exprEqual exprAnd2 | ;
exprEqual ::= exprRelational exprEqual2;
exprEqual2 ::= EQEQ exprRelational exprEqual2 | NOTEQ exprRelational exprEqual2 | ;
exprRelational ::= exprPlus exprRelational2;
exprRelational2 ::= LT exprPlus exprRelational2 | LTEQ exprPlus exprRelational2 | GT exprPlus exprRelational2 | GTEQ exprPlus exprRelational2 | ;

exprPlus ::= exprMult exprPlus2;
exprPlus2 ::= PLUS exprMult exprPlus2 | MINUS exprMult exprPlus2 | ;
exprMult ::= exprUnary exprMult2;
exprMult2 ::= MULT exprUnary exprMult2 | DIV exprUnary exprMult2 | ;
exprUnary ::= PLUS exprParenthesis | MINUS exprParenthesis | exprParenthesis;
exprParenthesis ::= LPAREN expr RPAREN | primary;
primary ::= ID primaryID | num | literal;
primaryID ::= LBRACK primary RBRACK | LPAREN exprList RPAREN | ;

literal ::= STRING | CHAR;
num ::= NUM_INT | NUM_FLOAT;

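CUP is an LALR parser generator, which means there is no need to avoid left recursion, so you can write the grammar in a more natural style. There is no need to split lists into "start" and "continuation" productions: that style is hard to read, error-prone, and usually does not let you build an accurate syntax tree directly.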
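As a rough sketch of what the left-recursive style could look like for the parameter list, reusing the terminal names from the question. The nonterminal `listaParamNaoVazia` is a name invented here for illustration, and `tipo` and the array size are abbreviated to keep the fragment short:

```
/* Sketch only: one left-recursive rule replaces the
   listaParamRestante / declParamRestante pair from the question. */
terminal COMMA, ID, LBRACK, RBRACK, INT, CHAR, NUM_INT;
non terminal listaParametros, listaParamNaoVazia, declaracaoParam, tipo;

start with listaParametros;

listaParametros    ::= /* empty */
                     | listaParamNaoVazia ;

/* hypothetical name: a non-empty, comma-separated list */
listaParamNaoVazia ::= declaracaoParam
                     | listaParamNaoVazia COMMA declaracaoParam ;

declaracaoParam    ::= tipo ID
                     | tipo ID LBRACK NUM_INT RBRACK ;

tipo               ::= INT | CHAR ;   // abbreviated for the sketch
```

Written this way, the list grows on the left, so a semantic action on the recursive rule can append each `declaracaoParam` to the list built so far.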

Your grammar has many conflicts, but this production is clearly ambiguous:

conjuntoInst ::= programa conjuntoInst | instrucoes conjuntoInst | ;

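Note that programa also has an empty alternative. So at the start of a conjuntoInst (or, indeed, in the middle of one) there can be any number of empty programas. You need to be very careful with non-terminals that can derive nothing; you must avoid using them as items in unseparated lists, because the parser cannot tell how many consecutive empty non-terminals are present in the source text.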
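The problem can be reduced to a very small grammar. The fragment below is an illustration rather than part of the original grammar: `lista`, `item` and `X` are made-up names standing in for conjuntoInst, programa and any token.

```
/* Minimal reproduction of the ambiguity (hypothetical names). */
terminal X;
non terminal lista, item;

start with lista;

lista ::= item lista
        | /* empty */ ;

item  ::= X
        | /* empty */ ;   // nullable list item: the source of the problem

/* The empty input already has unboundedly many derivations:
     lista => (empty)
     lista => item lista => (empty) (empty)
     lista => item lista => item item lista => ...
   so a generator has to choose between reducing the empty `lista`
   and reducing the empty `item`. */
```

I would expect CUP to report conflicts between the empty productions here, which is the same shape as the reduce/reduce warning quoted in the question (two `(*)` items with nothing before the dot).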

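Note that neither C++ nor C allows a completely empty statement, which would create exactly this ambiguity. The empty statement must still be terminated with a ;. That is what makes it possible to have lists of statements.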

So the usual model for C statements (including blocks), with a lot of detail omitted, is:

block ::= '{' statementList '}' ;
statementList ::= | statementList statement ;
statement ::= emptyStatement | expressionStatement | ifStatement | ...
            | declaration ;
emptyStatement ::= ';' ;
expressionStatement ::= expression ';' ;
   ...
ifStatement ::= IF '(' expression ')' statement
              | IF '(' expression ')' statement ELSE statement ;
declaration ::= modifiers type ID ...
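CUP needs declared terminal names rather than quoted characters, so here is a hedged sketch of part of that model using the token names from the question. The nonterminals `listaInst`, `instrucao` and `declaracao` are invented for the example, `tipo` is abbreviated, and expression and if statements are left out to keep it short:

```
/* Sketch only: every list item consumes at least one token. */
terminal LBRACE, RBRACE, SEMICOLON, BREAK, CONST, INT, CHAR, ID;
non terminal bloco, listaInst, instrucao, declaracao, especificador, tipo;

start with bloco;

bloco         ::= LBRACE listaInst RBRACE ;

listaInst     ::= /* empty */
                | listaInst instrucao ;

instrucao     ::= SEMICOLON              // empty statement still needs ';'
                | BREAK SEMICOLON
                | declaracao ;

/* especificador may be empty, but declaracao as a whole never is,
   because tipo, ID and SEMICOLON are always required. */
declaracao    ::= especificador tipo ID SEMICOLON ;
especificador ::= CONST
                | /* empty */ ;
tipo          ::= INT | CHAR ;           // abbreviated for the sketch
```

The point of the design is that an optional part such as `especificador` is fine inside a production that also requires tokens; what has to be avoided is a list whose items can themselves be empty.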