调车场验证表达式

Shunting-Yard Validate Expression

我们使用 Shunting-Yard 算法来计算表达式。我们可以通过简单地应用算法来验证表达式。如果缺少操作数、未匹配的括号和其他内容,它将失败。然而,调车场算法具有比人类可读的中缀更大的支持语法。例如,

1 + 2
+ 1 2
1 2 +

提供“1+2”作为调车场算法输入的所有可接受方式。 '+ 1 2' 和 '1 2 +' 不是有效的中缀,但标准的 Shunting-Yard 算法可以处理它们。该算法并不真正关心顺序,它按优先顺序应用运算符,获取 'nearest' 个操作数。

我们希望将我们的输入限制为有效的人类可读中缀。我正在寻找一种方法来修改 Shunting-Yard 算法使其因无效中缀而失败,或者在使用 Shunting-Yard 之前提供中缀验证。

有人知道任何已发布的技术可以做到这一点吗?我们必须同时支持基本运算符、自定义运算符、括号和函数(具有多个参数)。除了基本的在线运算符之外,我还没有看到任何可以使用的东西。

谢谢

有关调车场算法的精彩讨论是 http://www.engr.mun.ca/~theo/Misc/exp_parsing.htm 那里介绍的算法使用运算符堆栈的关键思想,但有一些语法来知道接下来应该期待什么。它有两个主要函数 E() 需要一个表达式和 P() 需要一个前缀运算符、一个变量、一个数字、括号和函数。前缀运算符总是比二元运算符绑定得更紧,因此您想先处理这个问题。

如果我们说 P 代表一些前缀序列而 B 是一个二元运算符那么任何表达式都将是

P B P B P

即你要么期待一个前缀序列,要么期待一个二元运算符。正式的语法是

E -> P (B P)*

P 将是

P -> -P | variable | constant | etc.

这将转换为伪代码

E() {
    P()
    while next token is a binary op:
         read next op
         push onto stack and do the shunting yard logic
         P()
    if any tokens remain report error
    pop remaining operators off the stack
}

P() {
    if next token is constant or variable:
         add to output
    else if next token is unary minus: 
         push uminus onto operator stack
         P()
}

您可以扩展它以处理其他一元运算符、函数、括号、后缀运算符。

我的问题的解决方案是增强 Wikipedia with the state machine recommended by Rici 上发布的算法。我在这里发布伪代码,因为它可能对其他人有用。

Support two states, ExpectOperand and ExpectOperator.

Set State to ExpectOperand
While there are tokens to read:
    If token is a constant (number)
        Error if state is not ExpectOperand.
        Push token to output queue.
        Set state to ExpectOperator.
    If token is a variable.
        Error if state is not ExpectOperand.
        Push token to output queue.
        Set state to ExpectOperator.
    If token is an argument separator (a comma).
        Error if state is not ExpectOperator.
        Until the top of the operator stack is a left parenthesis  (don't pop the left parenthesis).
            Push the top token of the stack to the output queue.
            If no left parenthesis is encountered then error.  Either the separator was misplaced or the parentheses were mismatched.
        Set state to ExpectOperand.
    If token is a unary operator.
        Error if the state is not ExpectOperand.
        Push the token to the operator stack.
        Set the state to ExpectOperand.
    If the token is a binary operator.
        Error if the state is not ExpectOperator.
        While there is an operator token at the top of the operator stack and either the current token is left-associative and of lower then or equal precedence to the operator on the stack, or the current token is right associative and of lower precedence than the operator on the stack.
            Pop the operator from the operator stack and push it onto the output queue.
        Push the current operator onto the operator stack.
        Set the state to ExpectOperand. 
    If the token is a Function.
        Error if the state is not ExpectOperand.  
        Push the token onto the operator stack.
        Set the state to ExpectOperand.
    If the token is a open parentheses.
        Error if the state is not ExpectOperand.
        Push the token onto the operator stack.
        Set the state to ExpectOperand.
    If the token is a close parentheses.
         Error if the state is not ExpectOperator.
         Until the token at the top of the operator stack is a left parenthesis.
             Pop the token off of the operator stack and push it onto the output queue.
         Pop the left parenthesis off of the operator stack and discard.
         If the token at the top of the operator stack is a function then pop it and push it onto the output queue.
         Set the state to ExpectOperator.
At this point you have processed all the input tokens.
While there are tokens on the operator stack.
    Pop the next token from the operator stack and push it onto the output queue.
    If a parenthesis is encountered then error.  There are mismatched parenthesis.

通过查看前面的记号,您可以轻松区分一元运算符和二元运算符(我具体说的是否定前缀和减法运算符)。如果没有前一个token,前一个token是一个左括号,或者前一个token是一个运算符那么你遇到的是一元前缀运算符,否则你遇到的是二元运算符。