Shift/reduce 使用 Sablecc 时发生冲突
Shift/reduce conflicts using Sablecc
我应该使用 Sablecc 为 MiniPython 编写一个 .grammar 文件。我遇到了这些 shift/reduce 冲突:
shift/reduce conflict in state [stack: TIf PTpower *] on TMult in {
[ PMltp = * TMult PTopower Mltp ] (shift)
[ PMlpt = * ] followed by TMult (reduce)
}
shift/reduce conflict in state [stack: TIf PTopower *] on TDiv in {
[ PMltp = * TDiv PTopower Mltp ] (shift)
[ PMltp = * ] followed by TDiv (reduce)
}
一些代币是:
id = letter (letter | digit)*;
digit = ['0' .. '9'];
letter = ['a' .. 'z']|['A' .. 'Z'];
pow = '**';
mult = '*';
div = '/';
plus = '+';
minus = '-';
assert = 'assert';
l_par = '(';
r_par = ')';
l_bra = '[';
r_bra = ']';
我的 .grammar 文件的一部分是这样的:
expression = multiplication exprsn;
exprsn = {addition} plus multiplication exprsn
| {subtraction} minus multiplication exprsn
| {empty};
topower = something tpwr;
tpwr = {topower} pow something tpwr
| {empty};
multiplication = topower mltp;
mltp = {multiplication} mult topower mltp
| {division} div topower mltp
| {empty};
something = {id} id
| {parexp} l_par expression r_par
| {fcall} functioncall
| {value} value
| {list} id l_bra expression r_bra
| {other} l_bra value comval* r_bra
| {assert} assert expression comexpr?;
comexpr = comma expression;
这是我尝试消除左递归后的语法。我注意到如果我从 something
生产中删除 assert
规则,我就不会发生冲突。此外,从 exprsn
、tpwr
和 mltp
规则中删除 {empty}
规则不会给我带来任何冲突,但我认为这不是解决此问题的正确方法。
如有任何提示,我们将不胜感激。
更新:这是整个语法,在按照要求删除左递归之前:
Package minipython;
Helpers
digit = ['0' .. '9'];
letter = ['a' .. 'z']|['A' .. 'Z'];
cr = 13;
lf = 10;
all = [0..127];
eol = lf | cr | cr lf ;
not_eol = [all - [cr + lf]];
Tokens
tab = 9;
plus = '+';
dot = '.';
pow = '**';
minus = '-';
mult = '*';
div = '/';
eq = '=';
minuseq = '-=';
diveq = '/=';
exclam = '!';
def = 'def';
equal = '==';
nequal = '!=';
l_par = '(';
r_par = ')';
l_bra = '[';
r_bra = ']';
comma= ',';
qmark = '?';
gqmark = ';';
assert = 'assert';
if = 'if';
while = 'while';
for = 'for';
in = 'in';
print = 'print';
return = 'return';
importkn = 'import';
as = 'as';
from = 'from';
less = '<';
great = '>';
true = 'true';
semi = ':';
false = 'false';
quote = '"';
blank = (' ' | lf | cr);
line_comment = '#' not_eol* eol;
number = digit+ | (digit+ '.' digit+);
id = letter (letter | digit)*;
string = '"'not_eol* '"';
cstring = ''' letter ''';
Ignored Tokens
blank, line_comment;
Productions
program = commands*;
commands = {stmt} statement
| {func} function;
function = def id l_par argument? r_par semi statement;
argument = id eqval? ceidv*;
eqval = eq value;
ceidv = comma id eqval?;
statement = {if} tab* if comparison semi statement
| {while} tab* while comparison semi statement
| {for} tab* for [id1]:id in [id2]:id semi statement
| {return} tab* return expression
| {print} tab* print expression comexpr*
| {assign} tab* id eq expression
| {minassign} tab* id minuseq expression
| {divassign} tab* id diveq expression
| {list} tab* id l_bra [ex1]:expression r_bra eq [ex2]:expression
| {fcall} tab* functioncall
| {import} import;
comexpr = comma expression;
expression = {multiplication} multiplication
| {addition} expression plus multiplication
| {subtraction} expression minus multiplication;
topower = {smth} something
| {power} topower pow something;
something = {id} id
| {parexp} l_par expression r_par
| {fcall} functioncall
| {value} value
| {list} id l_bra expression r_bra
| {assert} assert expression comexpr?
| {other} l_bra value comval* r_bra;
comval = comma value;
multiplication = {power} topower
| {multiplication} multiplication mult topower
| {division} multiplication div topower;
import = {import} importkn module asid? comod*
| {from} from module importkn id asid? comid*;
asid = as id;
comod = comma module asid?;
comid = comma id asid?;
module = idot* id;
idot = id dot;
comparison = {true} true
| {false} false
| {greater} [ex1]:expression great [ex2]:expression
| {lesser} [ex1]:expression less [ex2]:expression
| {equals} [ex1]:expression equal [ex2]:expression
| {nequals} [ex1]:expression nequal [ex2]:expression;
functioncall = id l_par arglist? r_par;
arglist = expression comexpr*;
value = {fcall} id dot functioncall
| {numb} number
| {str} string
| {cstr} cstring;
现在的shift/reduce冲突是:
shift/reduce conflict in state [stack: TIf PTopower *] on TPow in {
[ PMultiplication - PTopower * ] followed by TPow (reduce),
[ PTopower = PTopower * TPow PSomething ] (shift)
}
(注意:这个答案是从原始语法中得出的,而不是从删除左递归的尝试中得出的,这还有其他问题。没有必要从提供给 LALR 的语法中删除左递归(1) 像 SableCC 这样的解析器生成器。)
的确,基本问题是制作:
something = {assert} assert expression comexpr?
这个产生式很奇怪,部分原因是非终结符的名称 ("something") 没有提供任何关于它是什么的提示,但主要是因为人们通常认为 assert expression
是语句,而不是表达式的一部分。而 something
显然是从 expression
:
expression = multiplication
multiplication = topower
topower = something
但是 assert
生产以 expression
结束。这导致了歧义,因为
assert 4 + 3
可以解析为:(为了简洁省略了一些步骤):
expression = expression plus multiplication
| | |
V | |
something | |
| | |
V | |
assert expression | |
| | | |
| V V V
assert 4 + 3
或者,更自然地,如:
expression = something
|
V
assert expression
| |
| V
| expression plus multiplication
| | | |
| V V V
assert 4 + 3
第一次解析似乎不太可能,因为 assert
(据我猜测)实际上 return 不是一个值。 (虽然如果运算符是比较而不是加法,第二个会更自然。)
在没有看到您要解析的语言的定义的情况下,我无法就如何解决此问题提供具体建议,但我倾向于 assert
声明,并且将 something
重命名为更具描述性的名称("term" 很常见,尽管我通常使用 "atom")。
我应该使用 Sablecc 为 MiniPython 编写一个 .grammar 文件。我遇到了这些 shift/reduce 冲突:
shift/reduce conflict in state [stack: TIf PTpower *] on TMult in {
[ PMltp = * TMult PTopower Mltp ] (shift)
[ PMlpt = * ] followed by TMult (reduce)
}
shift/reduce conflict in state [stack: TIf PTopower *] on TDiv in {
[ PMltp = * TDiv PTopower Mltp ] (shift)
[ PMltp = * ] followed by TDiv (reduce)
}
一些代币是:
id = letter (letter | digit)*;
digit = ['0' .. '9'];
letter = ['a' .. 'z']|['A' .. 'Z'];
pow = '**';
mult = '*';
div = '/';
plus = '+';
minus = '-';
assert = 'assert';
l_par = '(';
r_par = ')';
l_bra = '[';
r_bra = ']';
我的 .grammar 文件的一部分是这样的:
expression = multiplication exprsn;
exprsn = {addition} plus multiplication exprsn
| {subtraction} minus multiplication exprsn
| {empty};
topower = something tpwr;
tpwr = {topower} pow something tpwr
| {empty};
multiplication = topower mltp;
mltp = {multiplication} mult topower mltp
| {division} div topower mltp
| {empty};
something = {id} id
| {parexp} l_par expression r_par
| {fcall} functioncall
| {value} value
| {list} id l_bra expression r_bra
| {other} l_bra value comval* r_bra
| {assert} assert expression comexpr?;
comexpr = comma expression;
这是我尝试消除左递归后的语法。我注意到如果我从 something
生产中删除 assert
规则,我就不会发生冲突。此外,从 exprsn
、tpwr
和 mltp
规则中删除 {empty}
规则不会给我带来任何冲突,但我认为这不是解决此问题的正确方法。
如有任何提示,我们将不胜感激。
更新:这是整个语法,在按照要求删除左递归之前:
Package minipython;
Helpers
digit = ['0' .. '9'];
letter = ['a' .. 'z']|['A' .. 'Z'];
cr = 13;
lf = 10;
all = [0..127];
eol = lf | cr | cr lf ;
not_eol = [all - [cr + lf]];
Tokens
tab = 9;
plus = '+';
dot = '.';
pow = '**';
minus = '-';
mult = '*';
div = '/';
eq = '=';
minuseq = '-=';
diveq = '/=';
exclam = '!';
def = 'def';
equal = '==';
nequal = '!=';
l_par = '(';
r_par = ')';
l_bra = '[';
r_bra = ']';
comma= ',';
qmark = '?';
gqmark = ';';
assert = 'assert';
if = 'if';
while = 'while';
for = 'for';
in = 'in';
print = 'print';
return = 'return';
importkn = 'import';
as = 'as';
from = 'from';
less = '<';
great = '>';
true = 'true';
semi = ':';
false = 'false';
quote = '"';
blank = (' ' | lf | cr);
line_comment = '#' not_eol* eol;
number = digit+ | (digit+ '.' digit+);
id = letter (letter | digit)*;
string = '"'not_eol* '"';
cstring = ''' letter ''';
Ignored Tokens
blank, line_comment;
Productions
program = commands*;
commands = {stmt} statement
| {func} function;
function = def id l_par argument? r_par semi statement;
argument = id eqval? ceidv*;
eqval = eq value;
ceidv = comma id eqval?;
statement = {if} tab* if comparison semi statement
| {while} tab* while comparison semi statement
| {for} tab* for [id1]:id in [id2]:id semi statement
| {return} tab* return expression
| {print} tab* print expression comexpr*
| {assign} tab* id eq expression
| {minassign} tab* id minuseq expression
| {divassign} tab* id diveq expression
| {list} tab* id l_bra [ex1]:expression r_bra eq [ex2]:expression
| {fcall} tab* functioncall
| {import} import;
comexpr = comma expression;
expression = {multiplication} multiplication
| {addition} expression plus multiplication
| {subtraction} expression minus multiplication;
topower = {smth} something
| {power} topower pow something;
something = {id} id
| {parexp} l_par expression r_par
| {fcall} functioncall
| {value} value
| {list} id l_bra expression r_bra
| {assert} assert expression comexpr?
| {other} l_bra value comval* r_bra;
comval = comma value;
multiplication = {power} topower
| {multiplication} multiplication mult topower
| {division} multiplication div topower;
import = {import} importkn module asid? comod*
| {from} from module importkn id asid? comid*;
asid = as id;
comod = comma module asid?;
comid = comma id asid?;
module = idot* id;
idot = id dot;
comparison = {true} true
| {false} false
| {greater} [ex1]:expression great [ex2]:expression
| {lesser} [ex1]:expression less [ex2]:expression
| {equals} [ex1]:expression equal [ex2]:expression
| {nequals} [ex1]:expression nequal [ex2]:expression;
functioncall = id l_par arglist? r_par;
arglist = expression comexpr*;
value = {fcall} id dot functioncall
| {numb} number
| {str} string
| {cstr} cstring;
现在的shift/reduce冲突是:
shift/reduce conflict in state [stack: TIf PTopower *] on TPow in {
[ PMultiplication - PTopower * ] followed by TPow (reduce),
[ PTopower = PTopower * TPow PSomething ] (shift)
}
(注意:这个答案是从原始语法中得出的,而不是从删除左递归的尝试中得出的,这还有其他问题。没有必要从提供给 LALR 的语法中删除左递归(1) 像 SableCC 这样的解析器生成器。)
的确,基本问题是制作:
something = {assert} assert expression comexpr?
这个产生式很奇怪,部分原因是非终结符的名称 ("something") 没有提供任何关于它是什么的提示,但主要是因为人们通常认为 assert expression
是语句,而不是表达式的一部分。而 something
显然是从 expression
:
expression = multiplication
multiplication = topower
topower = something
但是 assert
生产以 expression
结束。这导致了歧义,因为
assert 4 + 3
可以解析为:(为了简洁省略了一些步骤):
expression = expression plus multiplication
| | |
V | |
something | |
| | |
V | |
assert expression | |
| | | |
| V V V
assert 4 + 3
或者,更自然地,如:
expression = something
|
V
assert expression
| |
| V
| expression plus multiplication
| | | |
| V V V
assert 4 + 3
第一次解析似乎不太可能,因为 assert
(据我猜测)实际上 return 不是一个值。 (虽然如果运算符是比较而不是加法,第二个会更自然。)
在没有看到您要解析的语言的定义的情况下,我无法就如何解决此问题提供具体建议,但我倾向于 assert
声明,并且将 something
重命名为更具描述性的名称("term" 很常见,尽管我通常使用 "atom")。