pyparsing - How to parse string with comparison operators?
So, I have a NumericStringParser class (taken from here), defined as follows:
from __future__ import division
from pyparsing import Literal, CaselessLiteral, Word, Combine, Group, Optional, ZeroOrMore, Forward, nums, alphas, oneOf, ParseException
import math
import operator
class NumericStringParser(object):
    def __push_first__(self, strg, loc, toks):
        self.exprStack.append(toks[0])

    def __push_minus__(self, strg, loc, toks):
        if toks and toks[0] == "-":
            self.exprStack.append("unary -")

    def __init__(self):
        point = Literal(".")
        e = CaselessLiteral("E")
        fnumber = Combine(Word("+-" + nums, nums) +
                          Optional(point + Optional(Word(nums))) +
                          Optional(e + Word("+-" + nums, nums)))
        ident = Word(alphas, alphas + nums + "_$")
        plus = Literal("+")
        minus = Literal("-")
        mult = Literal("*")
        floordiv = Literal("//")
        div = Literal("/")
        mod = Literal("%")
        lpar = Literal("(").suppress()
        rpar = Literal(")").suppress()
        addop = plus | minus
        multop = mult | floordiv | div | mod
        expop = Literal("^")
        pi = CaselessLiteral("PI")
        tau = CaselessLiteral("TAU")
        expr = Forward()
        atom = ((Optional(oneOf("- +")) +
                 (ident + lpar + expr + rpar | pi | e | tau | fnumber).setParseAction(self.__push_first__))
                | Optional(oneOf("- +")) + Group(lpar + expr + rpar)
                ).setParseAction(self.__push_minus__)
        factor = Forward()
        factor << atom + \
            ZeroOrMore((expop + factor).setParseAction(self.__push_first__))
        term = factor + \
            ZeroOrMore((multop + factor).setParseAction(self.__push_first__))
        expr << term + \
            ZeroOrMore((addop + term).setParseAction(self.__push_first__))
        self.bnf = expr
        self.opn = {
            "+": operator.add,
            "-": operator.sub,
            "*": operator.mul,
            "/": operator.truediv,
            "//": operator.floordiv,
            "%": operator.mod,
            "^": operator.pow,
            "=": operator.eq,
            "!=": operator.ne,
            "<=": operator.le,
            ">=": operator.ge,
            "<": operator.lt,
            ">": operator.gt
        }
        self.fn = {
            "sin": math.sin,
            "cos": math.cos,
            "tan": math.tan,
            "asin": math.asin,
            "acos": math.acos,
            "atan": math.atan,
            "exp": math.exp,
            "abs": abs,
            "sqrt": math.sqrt,
            "floor": math.floor,
            "ceil": math.ceil,
            "trunc": math.trunc,
            "round": round,
            "fact": math.factorial,  # was the undefined bare name factorial
            "gamma": math.gamma
        }

    def __evaluate_stack__(self, s):
        op = s.pop()
        if op == "unary -":
            return -self.__evaluate_stack__(s)
        if op in ("+", "-", "*", "//", "/", "^", "%", "!=", "<=", ">=", "<", ">", "="):
            op2 = self.__evaluate_stack__(s)
            op1 = self.__evaluate_stack__(s)
            return self.opn[op](op1, op2)
        if op == "PI":
            return math.pi
        if op == "E":
            return math.e
        if op == "PHI":
            # phi was an undefined name; the golden ratio is assumed here
            return (1 + math.sqrt(5)) / 2
        if op == "TAU":
            return math.tau
        if op in self.fn:
            return self.fn[op](self.__evaluate_stack__(s))
        if op[0].isalpha():
            raise NameError(f"{op} is not defined.")
        return float(op)
I have an evaluate() function, defined as follows:
def evaluate(expression, parse_all=True):
    nsp = NumericStringParser()
    nsp.exprStack = []
    try:
        nsp.bnf.parseString(expression, parse_all)
    except ParseException as error:
        raise SyntaxError(error)
    return nsp.__evaluate_stack__(nsp.exprStack[:])
evaluate() is a function that parses a string and computes the mathematical operation it describes, for example:
>>> evaluate("5+5")
10.0
>>> evaluate("5^2+1")
26.0
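The problem is that it cannot evaluate comparison operators (=, !=, <, >, <=, >=). When I try evaluate("5=5"), it raises SyntaxError: Expected end of text (at char 1), (line:1, col:2) instead of returning True. How can the function evaluate these six comparison operators?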
As @rici pointed out, you added the evaluation part but not the parsing part.
The parser is defined in these lines:
factor <<= atom + \
    ZeroOrMore((expop + factor).setParseAction(self.__push_first__))
term = factor + \
    ZeroOrMore((multop + factor).setParseAction(self.__push_first__))
expr <<= term + \
    ZeroOrMore((addop + term).setParseAction(self.__push_first__))
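The order of these statements is important, because it is what makes the parser recognize the precedence of operations that you learned in high school math: exponentiation first, then multiplication and division, then addition and subtraction.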
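You need to insert your relational operators into this parser definition following the same pattern. After addition and subtraction, the convention from C operator precedence (I found this reference - https://www.tutorialspoint.com/cprogramming/c_operators_precedence.htm) is:
relational operations - <=, >=, >, <
equality operations - ==, !=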
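In your case, you chose to use '=' instead of '==', which should be fine in this setting. I recommend using pyparsing's oneOf helper to define these operator groups, since it takes care of the case where a shorter string could mask a longer one (as in your previous post).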
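A minimal sketch of defining the two operator groups with oneOf. The names relop and eqop are hypothetical and chosen only for illustration; the operator strings are the six from the question.

```python
from pyparsing import oneOf

# Hypothetical names for the two comparison-operator groups.
# oneOf takes care of ordering, so "<" does not mask "<=".
relop = oneOf("< > <= >=")
eqop = oneOf("= !=")

print(relop.parseString(">=")[0])  # >=
print(relop.parseString("<")[0])   # <
print(eqop.parseString("!=")[0])   # !=
```

Even though "<" and ">" are listed before "<=" and ">=", the two-character operators are still matched whole.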
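Note that by mixing these operations all into one expression parser, you will be able to evaluate things like 5 + 2 > 3. Since '>' has lower precedence, 5+2 will be evaluated first giving 7, then 7 > 3 will be evaluated, and operator.__gt__ will return True or False (which Python treats as 1 or 0 in arithmetic).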
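To make the evaluation order concrete, here is a small pure-Python sketch of the same stack evaluation applied to 5 + 2 > 3. The OPS table and the evaluate_stack function are simplified stand-ins written for this illustration, and the stack contents are what the parse actions would be expected to push once a relational level is added to the grammar.

```python
import operator

# Simplified operator table, just enough for this example
OPS = {"+": operator.add, ">": operator.gt}

def evaluate_stack(stack):
    """Evaluate a postfix token stack recursively, popping from the end."""
    token = stack.pop()
    if token in OPS:
        right = evaluate_stack(stack)  # right operand was pushed last
        left = evaluate_stack(stack)
        return OPS[token](left, right)
    return float(token)

# Postfix order for "5 + 2 > 3": the '+' is pushed before the '>'
stack = ["5", "2", "+", "3", ">"]
print(evaluate_stack(stack))  # True
```

Because '>' is pushed last, it is popped first and its left operand is the whole sub-expression 5 + 2.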
The difficulty of extending this example to other operators is what led me to write the infixNotation helper method in pyparsing. You may want to take a look at it.
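As a rough illustration of what infixNotation looks like for this problem, the sketch below lists the precedence levels from highest to lowest. The operand definition (pyparsing_common.number) and the name comparison_expr are assumptions made for this example and are not part of the original parser; the sketch only parses and does not evaluate.

```python
from pyparsing import infixNotation, opAssoc, oneOf, pyparsing_common

# Assumed operand: a plain number, converted to int or float
operand = pyparsing_common.number

# Precedence levels, highest first
comparison_expr = infixNotation(operand, [
    ("^", 2, opAssoc.RIGHT),
    (oneOf("* // / %"), 2, opAssoc.LEFT),
    (oneOf("+ -"), 2, opAssoc.LEFT),
    (oneOf("< > <= >="), 2, opAssoc.LEFT),
    (oneOf("= !="), 2, opAssoc.LEFT),
])

result = comparison_expr.parseString("5 + 2 > 3", parseAll=True)
print(result.asList())  # [[[5, '+', 2], '>', 3]]
```

The nesting of the result shows that 5 + 2 is grouped before the comparison is applied.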
EDIT:
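You asked about using Literal('<=') | Literal('>=') | etc., and as you wrote it, that will work just fine. You just have to be careful to look for the longer operators ahead of the shorter ones. If you write Literal('>') | Literal('>=') | ... then matching '>=' would fail, because the first alternative would match '>' and you would be left with '='. Using oneOf takes care of this for you.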
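The ordering pitfall can be reproduced in a few lines. The names short_first and long_first are hypothetical and used only for this demonstration.

```python
from pyparsing import Literal, ParseException

# '|' tries alternatives in order and takes the first one that matches
short_first = Literal(">") | Literal(">=")
long_first = Literal(">=") | Literal(">")

print(short_first.parseString(">=")[0])  # >
print(long_first.parseString(">=")[0])   # >=

try:
    # Requiring the whole input to match exposes the leftover '='
    short_first.parseString(">=", parseAll=True)
except ParseException as err:
    print("short-first ordering failed:", err)
```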
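To add the additional parser steps, you only need to do the expr <<= ... step for the last level. Look again at the pattern of the statements. Change expr <<= term + etc. to arith_expr = term + etc., follow it by adding the levels for relational_expr and equality_expr, and then finish with expr <<= equality_expr.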
The pattern for this is based on:
factor := atom (^ atom)...
term := factor (mult_op factor)...
arith_expr := term (add_op term)...
relation_expr := arith_expr (relation_op arith_expr)...
equality_expr := relation_expr (equality_op relation_expr)...
Try converting this to Python/pyparsing on your own.
# diffop and compop are the comparison-operator groups (their definitions are not shown here)
factor << atom + \
    ZeroOrMore((expop + factor).setParseAction(self.__push_first__))
term = factor + \
    ZeroOrMore((multop + factor).setParseAction(self.__push_first__))
arith_expr = term + \
    ZeroOrMore((addop + term).setParseAction(self.__push_first__))
relational = arith_expr + \
    ZeroOrMore((diffop + arith_expr).setParseAction(self.__push_first__))
expr <<= relational + \
    ZeroOrMore((compop + relational).setParseAction(self.__push_first__))
So I tested it, and it works! Thank you very much, PaulMcG! :)
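For reference, here is a compact, self-contained sketch of the layered grammar from this thread, cut down to integer operands so that it fits in a few lines. The names build_parser, eval_stack and evaluate_comparison are hypothetical, and this is a simplified stand-in for the NumericStringParser class rather than a replacement for it.

```python
import operator
from pyparsing import Forward, Literal, Suppress, Word, ZeroOrMore, nums, oneOf

OPS = {"^": operator.pow, "*": operator.mul, "/": operator.truediv,
       "+": operator.add, "-": operator.sub,
       "<": operator.lt, ">": operator.gt, "<=": operator.le,
       ">=": operator.ge, "=": operator.eq, "!=": operator.ne}

def build_parser(stack):
    """Build the grammar; parse actions push tokens onto stack in postfix order."""
    def push_first(strg, loc, toks):
        stack.append(toks[0])

    expr = Forward()
    number = Word(nums).setParseAction(push_first)
    atom = number | Suppress("(") + expr + Suppress(")")
    factor = Forward()
    factor <<= atom + ZeroOrMore((Literal("^") + factor).setParseAction(push_first))
    term = factor + ZeroOrMore((oneOf("* /") + factor).setParseAction(push_first))
    arith_expr = term + ZeroOrMore((oneOf("+ -") + term).setParseAction(push_first))
    relational_expr = arith_expr + ZeroOrMore(
        (oneOf("< > <= >=") + arith_expr).setParseAction(push_first))
    equality_expr = relational_expr + ZeroOrMore(
        (oneOf("= !=") + relational_expr).setParseAction(push_first))
    expr <<= equality_expr
    return expr

def eval_stack(stack):
    """Pop and evaluate the postfix stack recursively."""
    token = stack.pop()
    if token in OPS:
        right = eval_stack(stack)
        left = eval_stack(stack)
        return OPS[token](left, right)
    return float(token)

def evaluate_comparison(text):
    stack = []
    build_parser(stack).parseString(text, parseAll=True)
    return eval_stack(stack)

print(evaluate_comparison("5=5"))      # True
print(evaluate_comparison("5+2>3"))    # True
print(evaluate_comparison("2*3<=5"))   # False
```

Only the last level is assigned with expr <<= ...; the intermediate levels are ordinary variables, which mirrors the pattern described in the answer.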