Python PLY issue with if-else and while statements
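My if and while statements keep raising syntax errors from p_error(p), and PLY reports conflicts when I run the program. The problems come from the if-else and while statements, because everything worked before I added them. Any help would be appreciated.
If possible, please don't change the implementation too much, even if it is bad practice. I just want help understanding it; I don't want a complete overhaul (that would be plagiarism).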
import ply.lex as lex
import ply.yacc as yacc
# === Lexical tokens component ===
# List of possible token names that can be produced by the lexer
# NAME: variable name, L/RPAREN: Left/Right Parenthesis
tokens = (
    'NAME', 'NUMBER',
    'PLUS', 'MINUS', 'TIMES', 'DIVIDE', 'MODULO', 'EQUALS',
    'LPAREN', 'RPAREN',
    'IF', 'ELSE', 'WHILE',
    'EQUAL', 'NOTEQ', 'LARGE', 'SMALL', 'LRGEQ', 'SMLEQ',
)
# Regular expression rules for tokens format: t_<TOKEN>
# Simple tokens: regex for literals +,-,*,/,%,=,(,) and variable names (alphanumeric)
t_PLUS = r'\+'
t_MINUS = r'-'
t_TIMES = r'\*'
t_DIVIDE = r'/'
t_MODULO = r'%'
t_EQUALS = r'='
t_LPAREN = r'\('
t_RPAREN = r'\)'
t_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*'
t_IF = r'if'
t_ELSE = r'else'
t_WHILE = r'while'
t_EQUAL = r'\=\='
t_NOTEQ = r'\!\='
t_LARGE = r'\>'
t_SMALL = r'\<'
t_LRGEQ = r'\>\='
t_SMLEQ = r'\<\='
# complex tokens
# number token
def t_NUMBER(t):
    r'\d+' # digit special character regex
    t.value = int(t.value) # convert str -> int
    return t
# Ignored characters
t_ignore = " \t" # spaces & tabs regex
# newline character
def t_newline(t):
    r'\n+' # newline special character regex
    t.lexer.lineno += t.value.count("\n") # increase current line number accordingly
# error handling for invalid character
def t_error(t):
    print("Illegal character '%s'" % t.value[0]) # print error message with causing character
    t.lexer.skip(1) # skip invalid character
# Build the lexer
lex.lex()
# === Yacc parsing/grammar component ===
# Precedence & associative rules for the arithmetic operators
# 1. Unary, right-associative minus.
# 2. Binary, left-associative multiplication, division, and modulus
# 3. Binary, left-associative addition and subtraction
# Parenthesis precedence defined through the grammar
precedence = (
    ('left', 'PLUS', 'MINUS'),
    ('left', 'TIMES', 'DIVIDE', 'MODULO'),
    ('right', 'UMINUS'),
)
# dictionary of names (for storing variables)
names = {}
# --- Grammar:
# <statement> -> NAME = <expression> | <expression>
# <expression> -> <expression> + <expression>
# | <expression> - <expression>
# | <expression> * <expression>
# | <expression> / <expression>
# | <expression> % <expression>
# | - <expression>
# | ( <expression> )
# | NUMBER
# | NAME
# ---
# defined below using function definitions with format string/comment
# followed by logic of changing state of engine
# if statement
def p_statement_if(p):
    '''statement : IF LPAREN comparison RPAREN statement
                 | IF LPAREN comparison RPAREN statement ELSE statement'''
    if p[3]:
        p[0] = p[5]
    else:
        if p[7] is not None:
            p[0] = p[7]
def p_statement_while(p):
    'statement : WHILE LPAREN comparison RPAREN statement'
    while(p[3]):
        p[5];
# assignment statement: <statement> -> NAME = <expression>
def p_statement_assign(p):
    'statement : NAME EQUALS expression'
    names[p[1]] = p[3] # PLY engine syntax, p stores parser engine state
# expression statement: <statement> -> <expression>
def p_statement_expr(p):
    'statement : expression'
    print(p[1])
# comparison
def p_comparison_binop(p):
    '''comparison : expression EQUAL expression
                  | expression NOTEQ expression
                  | expression LARGE expression
                  | expression SMALL expression
                  | expression LRGEQ expression
                  | expression SMLEQ expression'''
    if p[2] == '==':
        p[0] = p[1] == p[3]
    elif p[2] == '!=':
        p[0] = p[1] != p[3]
    elif p[2] == '>':
        p[0] = p[1] > p[3]
    elif p[2] == '<':
        p[0] = p[1] < p[3]
    elif p[2] == '>=':
        p[0] = p[1] >= p[3]
    elif p[2] == '<=':
        p[0] = p[1] <= p[3]
# binary operator expression: <expression> -> <expression> + <expression>
#                                           | <expression> - <expression>
#                                           | <expression> * <expression>
#                                           | <expression> / <expression>
#                                           | <expression> % <expression>
def p_expression_binop(p):
    '''expression : expression PLUS expression
                  | expression MINUS expression
                  | expression TIMES expression
                  | expression DIVIDE expression
                  | expression MODULO expression'''
    if p[2] == '+':
        p[0] = p[1] + p[3]
    elif p[2] == '-':
        p[0] = p[1] - p[3]
    elif p[2] == '*':
        p[0] = p[1] * p[3]
    elif p[2] == '/':
        p[0] = p[1] / p[3]
    elif p[2] == '%':
        p[0] = p[1] % p[3]
# unary minus operator expression: <expression> -> - <expression>
def p_expression_uminus(p):
    'expression : MINUS expression %prec UMINUS'
    p[0] = -p[2]
# parenthesis group expression: <expression> -> ( <expression> )
def p_expression_group(p):
    'expression : LPAREN expression RPAREN'
    p[0] = p[2]
# number literal expression: <expression> -> NUMBER
def p_expression_number(p):
    'expression : NUMBER'
    p[0] = p[1]
# variable name literal expression: <expression> -> NAME
def p_expression_name(p):
    'expression : NAME'
    # attempt to lookup variable in current dictionary, throw error if not found
    try:
        p[0] = names[p[1]]
    except LookupError:
        print("Undefined name '%s'" % p[1])
        p[0] = 0
# handle parsing errors
def p_error(p):
    print("Syntax error at '%s'" % p.value)
# build parser
yacc.yacc()
# start interpreter and accept input using commandline/console
while True:
    try:
        s = input('calc > ') # get user input. use raw_input() on Python 2
    except EOFError:
        break
    yacc.parse(s) # parse user input string
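Your basic problem is that your lexer does not recognize the keywords if and while (nor else), because the t_NAME pattern is triggered in those cases. The problem and a possible solution are described in section 4.3 of the Ply documentation. The problem is that:
Tokens defined by strings are added next by sorting them in order of decreasing regular expression length (longer expressions are added first).
and the expression for t_NAME is longer than the simple keyword patterns.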
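To see why the ordering rule matters, here is a rough model of it using only the standard re module. This is a simplified sketch of the documented behavior, not PLY's actual code: the dictionary string_rules and the helper classify are made up for illustration. It sorts the string patterns by decreasing length, joins them into one alternation, and checks which alternative wins for each word. Python's re tries alternatives from left to right and takes the first one that matches, not the longest match.

```python
import re

# String-defined token patterns, as in the question (illustrative subset).
string_rules = {
    'NAME': r'[a-zA-Z_][a-zA-Z0-9_]*',
    'IF': r'if',
    'ELSE': r'else',
    'WHILE': r'while',
}

# Mimic the documented ordering: longest regular expression text first.
ordered = sorted(string_rules.items(), key=lambda item: len(item[1]), reverse=True)

# One master pattern; the first alternative that matches wins.
master = re.compile('|'.join('(?P<%s>%s)' % (name, pattern) for name, pattern in ordered))

def classify(text):
    """Return the name of the rule that matches at the start of text."""
    match = master.match(text)
    return match.lastgroup if match else None

results = {word: classify(word) for word in ('if', 'else', 'while', 'x')}
print(results)
```

In this model every word, including the three keywords, is classified as NAME, because the NAME pattern is the longest regular expression and is therefore tried first.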
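You can't fix this just by turning t_NAME into a lexer function, because tokens defined by functions are checked before tokens defined by strings.
But you can make t_NAME a function and, inside that function, look up the matched string in a dictionary to see whether it is a reserved word. (See the example at the end of the linked section, in the paragraph starting "To handle reserved words..."). When you do this, you don't define t_IF, t_WHILE and t_ELSE at all.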
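A minimal sketch of that approach, trimmed to the tokens needed to show the idea (the operator tokens from the question are left out for brevity). The dictionary name reserved follows the example in the Ply documentation; the test input string is made up.

```python
import ply.lex as lex

# Map each keyword's text to its token type.
reserved = {'if': 'IF', 'else': 'ELSE', 'while': 'WHILE'}

tokens = ['NAME', 'NUMBER'] + list(reserved.values())

# No t_IF, t_ELSE or t_WHILE rules: keywords are recognized inside t_NAME.
def t_NAME(t):
    r'[a-zA-Z_][a-zA-Z0-9_]*'
    # Reclassify the match if it is a reserved word, otherwise keep NAME.
    t.type = reserved.get(t.value, 'NAME')
    return t

def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t

t_ignore = ' \t'

def t_error(t):
    print("Illegal character '%s'" % t.value[0])
    t.lexer.skip(1)

lexer = lex.lex()
lexer.input('if iffy while x else 42')
kinds = [tok.type for tok in lexer]
print(kinds)
```

Because the whole identifier is matched first and only then compared against the dictionary, a name such as iffy stays a NAME instead of being split into the keyword if followed by fy.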
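The shift-reduce conflict is the "dangling else" problem. If you search for that phrase, you will find various solutions.
The simplest solution is to do nothing and just ignore the warning, since Ply will do the right thing by default.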
The second simplest solution is to add ('nonassoc', 'IF'), ('left', 'ELSE') to the precedence list, and to add a precedence marking to the if production:
'''statement : IF LPAREN comparison RPAREN statement %prec IF
             | IF LPAREN comparison RPAREN statement ELSE statement'''
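Giving ELSE a higher precedence value than IF ensures that when the parser has to choose between shifting ELSE in the second production and reducing by the first production, it chooses to shift (because ELSE has the higher precedence). In fact, that is the default behavior, so the precedence declaration does not affect the parsing behavior at all; however, it does suppress the shift-reduce conflict warning, because the conflict has been resolved.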
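Here is a small self-contained sketch of that precedence declaration on a cut-down grammar. The simplifications are assumptions made for brevity and are not part of the question's code: the condition is just a NUMBER, a statement is just a NAME, and the actions build tuples so that the resulting structure can be inspected. The arguments debug=False and write_tables=False only keep yacc from writing parser.out and parsetab.py.

```python
import ply.lex as lex
import ply.yacc as yacc

reserved = {'if': 'IF', 'else': 'ELSE'}
tokens = ['NAME', 'NUMBER', 'LPAREN', 'RPAREN'] + list(reserved.values())

t_LPAREN = r'\('
t_RPAREN = r'\)'
t_ignore = ' \t'

def t_NAME(t):
    r'[a-zA-Z_][a-zA-Z0-9_]*'
    t.type = reserved.get(t.value, 'NAME')
    return t

def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t

def t_error(t):
    print("Illegal character '%s'" % t.value[0])
    t.lexer.skip(1)

# Later entries bind tighter, so ELSE outranks IF.
precedence = (
    ('nonassoc', 'IF'),
    ('left', 'ELSE'),
)

def p_statement_if(p):
    '''statement : IF LPAREN NUMBER RPAREN statement %prec IF
                 | IF LPAREN NUMBER RPAREN statement ELSE statement'''
    # len(p) is 8 only for the production that has an ELSE branch.
    else_branch = p[7] if len(p) == 8 else None
    p[0] = ('if', p[3], p[5], else_branch)

def p_statement_name(p):
    'statement : NAME'
    p[0] = p[1]

def p_error(p):
    print("Syntax error at %r" % (p.value if p else 'end of input'))

lexer = lex.lex()
parser = yacc.yacc(debug=False, write_tables=False)
tree = parser.parse('if (1) if (0) a else b', lexer=lexer)
print(tree)
```

The printed tuple shows the else attached to the inner if, which is the usual resolution of the dangling else.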
For another solution, see [link missing].
Finally, please look at the comments on your question. Your actions for the if and while statements won't work at all.
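The reason is that a grammar action runs exactly once, when its production is reduced. By then the inner statement has already been reduced, so its assignment or print has already happened regardless of the condition, and p[3] is a fixed True or False, so while(p[3]) either never runs or never stops. In addition, p[7] does not exist for an if without else. One common way around this is to have the actions build a tree and to evaluate that tree after parsing. The sketch below shows only the evaluation side in plain Python; the tuple shapes ('assign', 'if', 'while', and so on) are hypothetical and stand for what the p_ functions would put in p[0].

```python
import operator

names = {}  # variable store, as in the question

OPS = {
    '+': operator.add, '-': operator.sub, '*': operator.mul,
    '<': operator.lt, '>': operator.gt, '==': operator.eq, '!=': operator.ne,
}

def evaluate(node):
    """Compute the value of an expression node."""
    kind = node[0]
    if kind == 'num':
        return node[1]
    if kind == 'name':
        return names.get(node[1], 0)
    if kind == 'binop':
        _, op, left, right = node
        return OPS[op](evaluate(left), evaluate(right))
    raise ValueError('unknown expression node: %r' % (kind,))

def execute(node):
    """Run a statement node; conditions are re-evaluated on every pass."""
    kind = node[0]
    if kind == 'assign':
        names[node[1]] = evaluate(node[2])
    elif kind == 'if':
        _, cond, then_branch, else_branch = node
        if evaluate(cond):
            execute(then_branch)
        elif else_branch is not None:
            execute(else_branch)
    elif kind == 'while':
        _, cond, body = node
        while evaluate(cond):
            execute(body)
    else:
        raise ValueError('unknown statement node: %r' % (kind,))

# Tree for:  x = 0   followed by   while (x < 5) x = x + 1
execute(('assign', 'x', ('num', 0)))
execute(('while',
         ('binop', '<', ('name', 'x'), ('num', 5)),
         ('assign', 'x', ('binop', '+', ('name', 'x'), ('num', 1)))))
print(names['x'])
```

With this split, an action such as p_statement_while would only store a ('while', p[3], p[5]) tuple in p[0], and the loop at the bottom of the program would call execute on the result of the parse.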