Bison 中的嵌套 Shift/reduce 冲突?

Nested Shift/reduce conflict in bison?

我是新手,我想了解这里发生了什么,我得到两个轮班减少冲突。我有语法(如果我遗漏了什么我可以添加所需的规则):

%start translation_unit

%token EOF      0               "end of file"

%token                      LBRA        "["
%token                      RBRA        "]"
%token                      LPAR        "("
%token                      RPAR        ")"
%token                      DOT         "."
%token                      ARROW       "->"
%token                      INCDEC      "++ or --"
%token                      COMMA       ","
%token                      AMP         "&"
%token                      STAR        "*"
%token                      ADDOP       "+ or -"
%token                      EMPH        "!"
%token                      DIVOP       "/ or %"
%token                      CMPO        "<, >, <=, or >="
%token                      CMPE        "== or !="
%token                      DAMP        "&&"
%token                      DVERT       "||"
%token                      ASGN        "="
%token                      CASS        "*=, /=, %=, +=, or -="
%token                      SEMIC       ";"
%token                      LCUR        "{"
%token                      RCUR        "}"
%token                      COLON       ":"

%token                      TYPEDEF     "typedef"
%token                      VOID        "void"
%token                      ETYPE       "_Bool, char, or int"
%token                      STRUCT      "struct"
%token                      ENUM        "enum"
%token                      CONST       "const"
%token                      IF          "if"
%token                      ELSE        "else"
%token                      DO          "do"
%token                      WHILE       "while"
%token                      FOR         "for"
%token                      GOTO        "goto"
%token                      CONTINUE    "continue"
%token                      BREAK       "break"
%token                      RETURN      "return"
%token                      SIZEOF      "sizeof"

%token<string>              IDF         "identifier"
%token<string>              TYPEIDF     "type identifier"
%token<int>                 INTLIT      "integer literal"
%token<string>              STRLIT      "string literal"

/////////////////////////////////

%%

/////////////////////////////////

primary_expression:
    IDF
    | INTLIT
    | STRLIT
    | LPAR expression RPAR
    ;

postfix_expression:
    primary_expression
    | postfix_expression LBRA expression RBRA
    | postfix_expression LPAR argument_expression_list_opt RPAR
    | postfix_expression DOT IDF
    | postfix_expression ARROW IDF
    | postfix_expression INCDEC
    ;

argument_expression_list_opt:
    %empty
    | argument_expression_list
    ;

argument_expression_list:
    assignment_expression
    | argument_expression_list COMMA assignment_expression
    ;

unary_expression:
    postfix_expression
    | INCDEC unary_expression
    | unary_operator cast_expression
    | SIZEOF LPAR type_name RPAR
    ;

unary_operator:
    AMP
    | STAR
    | ADDOP
    | EMPH
    ;

cast_expression:
    unary_expression
    ;

multiplicative_expression:
    cast_expression
    | multiplicative_expression STAR cast_expression
    | multiplicative_expression DIVOP cast_expression
    ;

additive_expression:
    multiplicative_expression
    | additive_expression ADDOP multiplicative_expression
    ;

relational_expression:
    additive_expression
    | relational_expression CMPO additive_expression
    ;

equality_expression:
    relational_expression
    | equality_expression CMPE relational_expression
    ;

logical_AND_expression:
    equality_expression
    | logical_AND_expression DAMP equality_expression
    ;

logical_OR_expression:
    logical_AND_expression
    | logical_OR_expression DVERT logical_AND_expression
    ;

assignment_expression:
    logical_OR_expression
    | unary_expression assignment_operator assignment_expression
    ;

assignment_operator:
    ASGN 
    | CASS
    ;

expression:
    assignment_expression
    ;

constant_expression:
    logical_OR_expression
    ;

declaration:
    no_leading_attribute_declaration
    ;

no_leading_attribute_declaration:
    declaration_specifiers init_declarator_list_opt SEMIC
    ;

init_declarator_list_opt:
    %empty
    | init_declarator_list
    ;

declaration_specifiers:
    declaration_specifier
    | declaration_specifier declaration_specifiers
    ;

declaration_specifier:
    storage_class_specifier
    | type_specifier_qualifier
    ;

init_declarator_list:
    init_declarator
    | init_declarator_list COMMA init_declarator
    ;

init_declarator:
    declarator
    ;

storage_class_specifier:
    TYPEDEF
    ;

type_specifier:
    VOID
    | ETYPE
    | struct_or_union_specifier
    | enum_specifier
    | typedef_name
    ;

struct_or_union_specifier:
    struct_or_union IDF LCUR member_declaration_list RCUR
    | struct_or_union IDF
    ;

struct_or_union:
    STRUCT
    ;

member_declaration_list:
    member_declaration
    | member_declaration_list member_declaration
    ;

member_declaration:
    specifier_qualifier_list member_declarator_list_opt SEMIC
    ;

member_declarator_list_opt:
    %empty
    | member_declarator_list
    ;

specifier_qualifier_list:
    type_specifier_qualifier
    | type_specifier_qualifier specifier_qualifier_list
    ;

type_specifier_qualifier:
    type_specifier
    | type_qualifier
    ;

member_declarator_list:
    member_declarator
    | member_declarator_list COMMA member_declarator
    ;

member_declarator:
    declarator
    ;

enum_specifier:
    ENUM IDF LCUR enumerator_list RCUR
    | ENUM IDF LCUR enumerator_list COMMA RCUR
    | ENUM IDF
    ;

enumerator_list:
    enumerator
    | enumerator_list COMMA enumerator
    ;

enumerator:
    ENUM
    | ENUM ASGN constant_expression
    ;

type_qualifier:
    CONST
    ;

pointer:
    STAR type_qualifier_list_opt
    | STAR type_qualifier_list_opt pointer
    ;

pointer_opt:
    %empty
    | pointer
    ;

declarator:
    pointer_opt direct_declarator
    ;

direct_declarator:
    IDF
    | LPAR declarator RPAR
    | array_declarator
    | function_declarator
    ;

abstract_declarator:
    pointer
    | pointer_opt direct_abstract_declarator
    ;

direct_abstract_declarator:
    LPAR abstract_declarator RPAR
    | array_abstract_declarator
    | function_abstract_declarator
    ;

direct_abstract_declarator_opt:
    %empty
    | direct_abstract_declarator
    ;

array_abstract_declarator:
    direct_abstract_declarator_opt LBRA assignment_expression RBRA
    ;

function_abstract_declarator:
    direct_abstract_declarator_opt LPAR parameter_type_list RPAR
    ;

array_declarator:
    direct_declarator LBRA assignment_expression RBRA
    ;

function_declarator:
    direct_declarator LPAR parameter_type_list RPAR
    ;

type_qualifier_list_opt:
    %empty
    | type_qualifier_list
    ;

type_qualifier_list:
    type_qualifier
    | type_qualifier_list type_qualifier
    ;

parameter_type_list:
    parameter_list
    ;

parameter_list:
    parameter_declaration
    | parameter_list COMMA parameter_declaration
    ;

parameter_declaration:
    declaration_specifiers declarator
    | declaration_specifiers abstract_declarator_opt
    ;
abstract_declarator_opt:
    %empty
    | abstract_declarator
    ;

type_name:
    specifier_qualifier_list abstract_declarator_opt
    ;

typedef_name:
    TYPEIDF
    ;

statement:
    matched
    | unmatched
    | other
    ;

other:
    expression_statement
    | compound_statement
    | jump_statement
    ;

matched:
    selection_statement_c
    | iteration_statement_c
    ;

unmatched:
    selection_statement_o
    | iteration_statement_o
    ;

compound_statement:
    LCUR block_item_list_opt RCUR
    ;

block_item_list_opt:
    %empty
    | block_item_list
    ;

block_item_list:
    block_item
    | block_item_list block_item
    ;

block_item:
    declaration
    | statement
    ;

expression_statement:
    expression_opt SEMIC
    ;

expression_opt:
    %empty
    | expression
    ;

selection_statement_o:
    IF LPAR expression RPAR statement
    | IF LPAR expression RPAR matched ELSE unmatched
    ;

selection_statement_c:
    IF LPAR expression RPAR matched ELSE matched
    ;

iteration_statement_o:
    WHILE LPAR expression RPAR unmatched
    | FOR LPAR expression_opt SEMIC expression_opt SEMIC expression_opt RPAR unmatched
    ;

iteration_statement_c:
    WHILE LPAR expression RPAR matched
    | DO statement WHILE LPAR expression RPAR SEMIC
    | FOR LPAR expression_opt SEMIC expression_opt SEMIC expression_opt RPAR matched
    ;

jump_statement:
    RETURN expression_opt SEMIC
    ;

translation_unit:
    external_declaration
    | translation_unit external_declaration
    ;

external_declaration:
    function_definition
    | declaration
    ;

function_definition:
    declaration_specifiers declarator compound_statement
    ;

/////////////////////////////////

%%

我遇到了这个 shift/reduce 冲突:

State 156

   85 declarator: pointer_opt . direct_declarator
   91 abstract_declarator: pointer_opt . direct_abstract_declarator

    "("           shift, and go to state 187
    "identifier"  shift, and go to state 44

    "("       [reduce using rule 111 (direct_abstract_declarator_opt)]
    ^^conflict with previous "(" ^^
    $default  reduce using rule 111 (direct_abstract_declarator_opt)

...

State 197

   91 abstract_declarator: pointer_opt . direct_abstract_declarator

    "("  shift, and go to state 213
    "("       [reduce using rule 111 (direct_abstract_declarator_opt)]
    ^^conflict^^

    $default  reduce using rule 111 (direct_abstract_declarator_opt)


所以我理解为:当我得到“(”时,我可以做两件事。首先,从 direct_declarator 我得到 LPAR 声明符 RPAR,其中 LPAR 是“(”
...或...
我可以再次减少 direct_abstract_declarator->array_abstract_declarator->direct_abstract_declarator_opt->direct_abstract_declarator->LPAR abstract_declarator RPAR,其中 LPAR 是“(”。所以我无法决定哪种方式向右走?
我应该如何解决这个冲突?还是我完全看错了?

创建 shift-reduce 冲突的最简单方法之一是引入一个表示可选实例的新非终端:

something_opt
    : something
    | %empty

有时这会奏效,但更常见的是,在某些情况下 something 可能是也可能不是可选的。换句话说,您有两个使用 something:

的不同作品
a: something something_else "END_A"
b: something_opt something_else "END_B"

这不是模棱两可的,但是它不能用一个标记向前看来解析,因为在识别出something之后,解析器必须决定是否将其缩减为something_opt,如果下面的文字变成了bsomething_opt 在语义上基本上没有意义;有 something 或没有 something。如果没有那个细节,解析器就没有问题;它可以继续解析 something_else,然后根据它是否遇到 END_AEND_B,它可以决定是否减少 ab

解决方法很简单:不用定义 something_opt,只需复制产生式,这样一个产生式有 something,另一个没有。

在你的例子中,有问题的可选非终端是 direct_abstract_declarator_opt:

direct_abstract_declarator_opt:
    %empty
    | direct_abstract_declarator
 
array_abstract_declarator:
    direct_abstract_declarator_opt LBRA assignment_expression RBRA

function_abstract_declarator:
    direct_abstract_declarator_opt LPAR parameter_type_list RPAR

这些是唯一使用它的地方,所以我们可以通过复制其他两个非终端的产生式来摆脱它:

array_abstract_declarator:
    direct_abstract_declarator LBRA assignment_expression RBRA
    | LBRA assignment_expression RBRA

function_abstract_declarator:
    direct_abstract_declarator LPAR parameter_type_list RPAR
    | LPAR parameter_type_list RPAR

如果可能的话,您应该使用此模型来编写带有可选非终结符的语法。它确实需要复制语义操作,但希望语义操作只是对 AST 节点构建器的简单调用。而且它会为你省去很多不必要的悲伤。