Parser example of JavaCC for arithmetic expressions

This is my first day working with parsers. My first JavaCC example is the following grammar:

SKIP:  { " " | "\t" | "\n" | "\r"                    }
TOKEN: { "(" | ")" | "+" | "*" | <NUM: (["0"-"9"])+> }

void S(): {} { E() <EOF>           }
void E(): {} { T() ("+" T())*      }
void T(): {} { F() ("*" F())*      }
void F(): {} { <NUM> | "(" E() ")" }

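I'm wondering why it can handle input like int+(int+int). From my understanding of the algorithm, I thought it would split this expression into [int] & [(int] & [int)] as Ts and then fail to parse. Why does it produce the expected parse?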

Consider the input string "1+(2+3)" as a sequence of tokens:

<NUM> "+" "(" <NUM> "+" <NUM> ")" <EOF>
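To make the tokenization step concrete, here is a minimal hand-written sketch in Java of what the SKIP and TOKEN rules above amount to. It is not the code JavaCC generates; the class name `SketchLexer` and the method `tokenize` are made up for illustration, and tokens are represented simply as their printed names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: mimics the SKIP/TOKEN rules by hand.
public class SketchLexer {
    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
                i++; // SKIP: whitespace produces no token
            } else if (c == '(' || c == ')' || c == '+' || c == '*') {
                tokens.add("\"" + c + "\""); // single-character literal token
                i++;
            } else if (c >= '0' && c <= '9') {
                // <NUM>: one or more digits, matched greedily
                while (i < input.length() && input.charAt(i) >= '0' && input.charAt(i) <= '9') {
                    i++;
                }
                tokens.add("<NUM>");
            } else {
                throw new IllegalArgumentException("Unexpected character '" + c + "' at index " + i);
            }
        }
        tokens.add("<EOF>"); // end-of-input marker
        return tokens;
    }

    public static void main(String[] args) {
        // prints: <NUM> "+" "(" <NUM> "+" <NUM> ")" <EOF>
        System.out.println(String.join(" ", tokenize("1+(2+3)")));
    }
}
```

The point to notice is that "(" and the following number are separate tokens, so the parser never sees a unit like "(int".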

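This token sequence can be derived from S() as follows. The . marks how far the input has been consumed: everything to its left has already been matched, and the . moves to the right as tokens are consumed.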

         . S() 
           ~~~ expand
     ==> . E() <EOF>
           ~~~ expand
     ==> . T() ("+" T())* <EOF>
           ~~~ expand
     ==> . F() ("*" F())* ("+" T())* <EOF>
           ~~~ expand
     ==> . (<NUM> | "(" E() ")") ("*" F())* ("+" T())* <EOF>
           ~~~~~~~~~~~~~~~~~~~~~ choose first and consume
     ==> <NUM> . ("*" F())* ("+" T())* <EOF>
                 ~~~~~~~~~~ delete
     ==> <NUM> . ("+" T())* <EOF>
                 ~~~~~~~~~~ unroll and consume
     ==> <NUM> "+" . T() ("+" T())* <EOF>
                     ~~~ expand
     ==> <NUM> "+" . F() ("*" F())* ("+" T())* <EOF>
                     ~~~ expand
     ==> <NUM> "+" . (<NUM> | "(" E() ")") ("*" F())* ("+" T())* <EOF>
                     ~~~~~~~~~~~~~~~~~~~~~ choose second and consume
     ==> <NUM> "+" "(" . E() ")" ("*" F())* ("+" T())* <EOF>
                         ~~~ expand
     ==> <NUM> "+" "(" . T() ("+" T())* ")" ("*" F())* ("+" T())* <EOF>
                         ~~~ expand
     ==> <NUM> "+" "(" . F() ("*" F())* ("+" T())* ")" ("*" F())* ("+" T())* <EOF>
                         ~~~ expand
     ==> <NUM> "+" "(" . (<NUM> | "(" E() ")") ("*" F())* ("+" T())* ")" ("*" F())* ("+" T())* <EOF>
                         ~~~~~~~~~~~~~~~~~~~~~ choose first and consume
     ==> <NUM> "+" "(" <NUM> . ("*" F())* ("+" T())* ")" ("*" F())* ("+" T())* <EOF>
                               ~~~~~~~~~~ delete
     ==> <NUM> "+" "(" <NUM> . ("+" T())* ")" ("*" F())* ("+" T())* <EOF>
                               ~~~~~~~~~~ unroll and consume
     ==> <NUM> "+" "(" <NUM> "+" . T() ("+" T())* ")" ("*" F())* ("+" T())* <EOF>
                                   ~~~ expand
     ==> <NUM> "+" "(" <NUM> "+" . F() ("*" F())* ("+" T())* ")" ("*" F())* ("+" T())* <EOF>
                                   ~~~ expand
     ==> <NUM> "+" "(" <NUM> "+" . (<NUM> | "(" E() ")") ("*" F())* ("+" T())* ")" ("*" F())* ("+" T())* <EOF>
                                   ~~~~~~~~~~~~~~~~~~~~~ choose first and consume
     ==> <NUM> "+" "(" <NUM> "+" <NUM> . ("*" F())* ("+" T())* ")" ("*" F())* ("+" T())* <EOF>
                                         ~~~~~~~~~~ delete
     ==> <NUM> "+" "(" <NUM> "+" <NUM> . ("+" T())* ")" ("*" F())* ("+" T())* <EOF>
                                         ~~~~~~~~~~ delete and consume
     ==> <NUM> "+" "(" <NUM> "+" <NUM> ")" . ("*" F())* ("+" T())* <EOF>
                                             ~~~~~~~~~~ delete
     ==> <NUM> "+" "(" <NUM> "+" <NUM> ")" . ("+" T())* <EOF>
                                             ~~~~~~~~~~ delete and consume
     ==> <NUM> "+" "(" <NUM> "+" <NUM> ")" <EOF> .

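Key:

  • Expand: replace a nonterminal with its definition.
  • Choose: replace (S|T) with either S or T (here S and T stand for arbitrary expressions, not the nonterminals of the grammar).
  • Unroll: replace (S)* with S (S)*.
  • Delete: replace (S)* with nothing.
  • Consume: match the next input token and move the . past it.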
The derivation above is a leftmost derivation. I chose to show a leftmost derivation because it reflects the way JavaCC works.
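The derivation can also be read as a recursive-descent parser, with one method per nonterminal. The following is a minimal sketch in Java, not the code JavaCC actually generates: the class `SketchParser` and its helpers `peek`, `expect`, and `accepts` are hypothetical, and for brevity it reads characters directly instead of going through a separate token manager. It shows why "(" never ends up glued to a number: when F() sees "(", it calls E() recursively and only afterwards requires ")".

```java
// Hypothetical illustration only: a hand-written recognizer for the same grammar.
public class SketchParser {
    private final String input;
    private int pos = 0;

    public SketchParser(String input) {
        this.input = input;
    }

    // Skip whitespace and return the next character without consuming it.
    // '\0' stands in for <EOF> (an assumption of this sketch).
    private char peek() {
        while (pos < input.length() && " \t\n\r".indexOf(input.charAt(pos)) >= 0) {
            pos++;
        }
        return pos < input.length() ? input.charAt(pos) : '\0';
    }

    private void expect(char c) {
        if (peek() != c) {
            throw new IllegalStateException("Expected '" + c + "' at index " + pos);
        }
        pos++;
    }

    // S ::= E <EOF>
    public void S() {
        E();
        if (peek() != '\0') {
            throw new IllegalStateException("Expected end of input at index " + pos);
        }
    }

    // E ::= T ("+" T)*   -- "unroll" while the next token is "+", otherwise "delete"
    private void E() {
        T();
        while (peek() == '+') {
            pos++;
            T();
        }
    }

    // T ::= F ("*" F)*
    private void T() {
        F();
        while (peek() == '*') {
            pos++;
            F();
        }
    }

    // F ::= <NUM> | "(" E ")"   -- the next token decides which alternative to "choose"
    private void F() {
        char c = peek();
        if (c >= '0' && c <= '9') {
            while (pos < input.length() && input.charAt(pos) >= '0' && input.charAt(pos) <= '9') {
                pos++;
            }
        } else if (c == '(') {
            pos++;
            E();         // the whole parenthesized expression is parsed here
            expect(')');
        } else {
            throw new IllegalStateException("Expected a number or '(' at index " + pos);
        }
    }

    // Convenience wrapper: true if the whole input matches S.
    public static boolean accepts(String s) {
        try {
            new SketchParser(s).S();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

With this sketch, `SketchParser.accepts("1+(2+3)")` returns true, while `SketchParser.accepts("1+(2+3")` returns false because the recursive call to E() returns at end of input and the required ")" is missing.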