yacc: conflicts: 1 reduce/reduce

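To learn Lex/Yacc, I am writing a CSV parser that follows the grammar specified on page 3 of RFC 4180.

I have run into a reduce/reduce conflict and I am not sure how to proceed. It appears to be a conflict between rules 1 and 3 of my grammar, but I don't know any other way to describe a CSV file that may or may not have a line break after the last record. Also, when I remove rule 10 (the empty-field rule), the reduce/reduce conflict disappears; however, I need to handle empty fields.

What is wrong with my grammar, and how should I correct it?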

Yacc source

%token COMMA
%token DQUOTE
%token CRLF
%token TEXTDATA

%%

file: records CRLF
    | records;

records: records CRLF record
       | record;

record: fields;

fields: fields COMMA field
      | field;

field: DQUOTE escaped DQUOTE
     | TEXTDATA
     | ;

escaped: escaped TEXTDATA
       | escaped COMMA
       | escaped CRLF
       | escaped DQUOTE DQUOTE
       | TEXTDATA
       | COMMA
       | CRLF
       | DQUOTE DQUOTE;

yacc -v output

State 14 conflicts: 1 reduce/reduce


Grammar

    0 $accept: file $end

    1 file: records CRLF
    2     | records

    3 records: records CRLF record
    4        | record

    5 record: fields

    6 fields: fields COMMA field
    7       | field

    8 field: DQUOTE escaped DQUOTE
    9      | TEXTDATA
   10      | /* empty */

   11 escaped: escaped TEXTDATA
   12        | escaped COMMA
   13        | escaped CRLF
   14        | escaped DQUOTE DQUOTE
   15        | TEXTDATA
   16        | COMMA
   17        | CRLF
   18        | DQUOTE DQUOTE


Terminals, with rules where they appear

$end (0) 0
error (256)
COMMA (258) 6 12 16
DQUOTE (259) 8 14 18
CRLF (260) 1 3 13 17
TEXTDATA (261) 9 11 15


Nonterminals, with rules where they appear

$accept (7)
    on left: 0
file (8)
    on left: 1 2, on right: 0
records (9)
    on left: 3 4, on right: 1 2 3
record (10)
    on left: 5, on right: 3 4
fields (11)
    on left: 6 7, on right: 5 6
field (12)
    on left: 8 9 10, on right: 6 7
escaped (13)
    on left: 11 12 13 14 15 16 17 18, on right: 8 11 12 13 14


state 0

    0 $accept: . file $end

    DQUOTE    shift, and go to state 1
    TEXTDATA  shift, and go to state 2

    $default  reduce using rule 10 (field)

    file     go to state 3
    records  go to state 4
    record   go to state 5
    fields   go to state 6
    field    go to state 7


state 1

    8 field: DQUOTE . escaped DQUOTE

    COMMA     shift, and go to state 8
    DQUOTE    shift, and go to state 9
    CRLF      shift, and go to state 10
    TEXTDATA  shift, and go to state 11

    escaped  go to state 12


state 2

    9 field: TEXTDATA .

    $default  reduce using rule 9 (field)


state 3

    0 $accept: file . $end

    $end  shift, and go to state 13


state 4

    1 file: records . CRLF
    2     | records .
    3 records: records . CRLF record

    CRLF  shift, and go to state 14

    $default  reduce using rule 2 (file)


state 5

    4 records: record .

    $default  reduce using rule 4 (records)


state 6

    5 record: fields .
    6 fields: fields . COMMA field

    COMMA  shift, and go to state 15

    $default  reduce using rule 5 (record)


state 7

    7 fields: field .

    $default  reduce using rule 7 (fields)


state 8

   16 escaped: COMMA .

    $default  reduce using rule 16 (escaped)


state 9

   18 escaped: DQUOTE . DQUOTE

    DQUOTE  shift, and go to state 16


state 10

   17 escaped: CRLF .

    $default  reduce using rule 17 (escaped)


state 11

   15 escaped: TEXTDATA .

    $default  reduce using rule 15 (escaped)


state 12

    8 field: DQUOTE escaped . DQUOTE
   11 escaped: escaped . TEXTDATA
   12        | escaped . COMMA
   13        | escaped . CRLF
   14        | escaped . DQUOTE DQUOTE

    COMMA     shift, and go to state 17
    DQUOTE    shift, and go to state 18
    CRLF      shift, and go to state 19
    TEXTDATA  shift, and go to state 20


state 13

    0 $accept: file $end .

    $default  accept


state 14

    1 file: records CRLF .
    3 records: records CRLF . record

    DQUOTE    shift, and go to state 1
    TEXTDATA  shift, and go to state 2

    $end      reduce using rule 1 (file)
    $end      [reduce using rule 10 (field)]
    $default  reduce using rule 10 (field)

    record  go to state 21
    fields  go to state 6
    field   go to state 7


state 15

    6 fields: fields COMMA . field

    DQUOTE    shift, and go to state 1
    TEXTDATA  shift, and go to state 2

    $default  reduce using rule 10 (field)

    field  go to state 22


state 16

   18 escaped: DQUOTE DQUOTE .

    $default  reduce using rule 18 (escaped)


state 17

   12 escaped: escaped COMMA .

    $default  reduce using rule 12 (escaped)


state 18

    8 field: DQUOTE escaped DQUOTE .
   14 escaped: escaped DQUOTE . DQUOTE

    DQUOTE  shift, and go to state 23

    $default  reduce using rule 8 (field)


state 19

   13 escaped: escaped CRLF .

    $default  reduce using rule 13 (escaped)


state 20

   11 escaped: escaped TEXTDATA .

    $default  reduce using rule 11 (escaped)


state 21

    3 records: records CRLF record .

    $default  reduce using rule 3 (records)


state 22

    6 fields: fields COMMA field .

    $default  reduce using rule 6 (fields)


state 23

   14 escaped: escaped DQUOTE DQUOTE .

    $default  reduce using rule 14 (escaped)

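Given an input such as TEXTDATA CRLF, the grammar is ambiguous: the parser can derive file -> records CRLF and then derive records to a single record, or it can derive file -> records and then derive records to two records, the second of which consists of a single empty field. This is the conflict reported in state 14, where on $end the parser could reduce by either rule 1 (file) or rule 10 (the empty field).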

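To avoid this ambiguity, you can remove the records CRLF alternative. Files that end with a CRLF will still be accepted; they will simply be treated as having a final record that consists of a single empty field.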
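
As a sketch of that first option, here is the grammar from the question with only the records CRLF alternative removed; everything else is unchanged. With rule 1 gone, state 14 no longer has two reductions competing on $end, so I would expect yacc -v to report no conflicts, but it is worth running to confirm.

```yacc
%token COMMA
%token DQUOTE
%token CRLF
%token TEXTDATA

%%

/* The "records CRLF" alternative is gone. A trailing CRLF is now parsed
   as a separator followed by a record that holds one empty field. */
file: records;

records: records CRLF record
       | record;

record: fields;

fields: fields COMMA field
      | field;

field: DQUOTE escaped DQUOTE
     | TEXTDATA
     | /* empty */ ;

escaped: escaped TEXTDATA
       | escaped COMMA
       | escaped CRLF
       | escaped DQUOTE DQUOTE
       | TEXTDATA
       | COMMA
       | CRLF
       | DQUOTE DQUOTE;
```

If the extra empty record at the end is a nuisance, the semantic action for file (or the code that consumes the records) can drop a final record whose only field is empty.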

If that is not what you want, you need to rewrite fields so that the last record is not allowed to be empty (and then keep the file: records CRLF production).
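
One possible way to do this second rewrite is sketched below. Note the assumption: instead of restricting only the last record, it forbids the "single empty field" record everywhere, because that is the simplest form I can see that stays LALR(1). The nonterminal value is a name I made up for a non-empty field; it is not in the original grammar. A record must now start with either a value or a COMMA, so after records CRLF the parser can decide on one token of lookahead: $end means the trailing-CRLF rule, anything else starts a new record.

```yacc
%token COMMA
%token DQUOTE
%token CRLF
%token TEXTDATA

%%

file: records CRLF
    | records;

records: records CRLF record
       | record;

/* A record is never just one empty field: it starts with a non-empty
   value, or with a COMMA (empty first field followed by another field). */
record: value
      | COMMA field
      | record COMMA field;

/* Fields after the first one may still be empty. */
field: value
     | /* empty */ ;

/* "value" is a hypothetical helper nonterminal for a non-empty field. */
value: DQUOTE escaped DQUOTE
     | TEXTDATA;

escaped: escaped TEXTDATA
       | escaped COMMA
       | escaped CRLF
       | escaped DQUOTE DQUOTE
       | TEXTDATA
       | COMMA
       | CRLF
       | DQUOTE DQUOTE;
```

The trade-off is that blank lines in the middle of the file and a completely empty file are rejected by this version, so it only fits if you do not need to accept those.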

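PS: By the way, it seems to me that you should move some of the parsing work into the lexer, in particular the part that parses the contents of quoted strings. Something like "abc" is best handled by having the lexer turn it into a single token.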
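
To illustrate the PS, this is roughly what the grammar could shrink to once the lexer hands over each quoted field as one token. QUOTED is a token name I am assuming for the example, and the lexer pattern in the comment is only a suggestion for a flex-style rule; neither comes from the question. The sketch uses the file: records form from the first option so that it stays free of the original conflict.

```yacc
%token COMMA
%token CRLF
%token TEXTDATA
%token QUOTED   /* hypothetical token: one complete quoted field */

/* Assumed lexer side (not shown here): a single rule along the lines of
       \"([^"]|\"\")*\"    returns QUOTED
   that matches the opening quote, any run of non-quote characters or
   doubled quotes, and the closing quote. Commas and line breaks inside
   the quotes never reach the parser as COMMA or CRLF tokens. */

%%

file: records;

records: records CRLF record
       | record;

record: fields;

fields: fields COMMA field
      | field;

field: QUOTED
     | TEXTDATA
     | /* empty */ ;
```

The whole escaped nonterminal and the DQUOTE token disappear, and un-doubling the "" pairs becomes a string operation in the lexer action rather than a set of grammar rules.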