What is wrong with my ANTLR grammar for parsing a simplistic Java file?

ANTLR grammar:

grammar Java;

// Parser

compilationUnit: classDeclaration;

classDeclaration : 'class' CLASS_NAME classBlock
  ;

classBlock: OPEN_BLOCK method* CLOSE_BLOCK
  ;

method: methodReturnValue methodName methodArgs methodBlock
  ;

methodReturnValue: CLASS_NAME
  ;

methodName: METHOD_NAME
  ;

methodArgs: OPEN_PAREN CLOSE_PAREN
  ;

methodBlock: OPEN_BLOCK CLOSE_BLOCK
  ;

// Lexer

CLASS_NAME: ALPHA;
METHOD_NAME: ALPHA;

WS: [ \t\n] -> skip;

OPEN_BLOCK: '{';
CLOSE_BLOCK: '}';

OPEN_PAREN: '(';
CLOSE_PAREN: ')';

fragment ALPHA: [a-zA-Z][a-zA-Z0-9]*;

Pseudo-Java file:

class Test {

    void run() { }

}

It matches almost everything, except METHOD_NAME, which wrongly ends up associated with methodArgs.

line 3:6 mismatched input 'run' expecting METHOD_NAME

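This is about token ambiguity. This question has been asked several times in the last few weeks. Please follow the links, especially the one on disambiguate.

As soon as you get a mismatched error, add -tokens to grun to display the tokens. It helps you find the discrepancy between what you think the lexer will do and what it actually does. With your grammar: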

CLASS_NAME: ALPHA;
METHOD_NAME: ALPHA;

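every input matched by ALPHA is ambiguous: both rules match it equally well, and when two lexer rules match the same text, ANTLR chooses the rule defined first, here CLASS_NAME.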
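To make the tie-break concrete, here is a toy simulation in plain Java. It is only a sketch of the rule as described above (longest match wins, and on equal length the earlier rule wins), not ANTLR's actual implementation; the class name LexerTieBreak and the method winningRule are made up for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LexerTieBreak {
    // Insertion order stands in for the order of the lexer rules in the grammar.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("CLASS_NAME", Pattern.compile("[a-zA-Z][a-zA-Z0-9]*"));
        RULES.put("METHOD_NAME", Pattern.compile("[a-zA-Z][a-zA-Z0-9]*"));
    }

    // Returns the name of the rule that wins at the start of the input,
    // or null if no rule matches there.
    static String winningRule(String input) {
        String best = null;
        int bestLen = 0;
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            Matcher m = rule.getValue().matcher(input);
            // Only a strictly longer match replaces the current winner,
            // so on a tie the rule listed first keeps the token.
            if (m.lookingAt() && m.end() > bestLen) {
                best = rule.getKey();
                bestLen = m.end();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println("void -> " + winningRule("void"));
        System.out.println("run  -> " + winningRule("run"));
    }
}
```

Both lines report CLASS_NAME: since the two patterns are identical, METHOD_NAME can never produce a strictly longer match, so it can never win.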

$ grun Question compilationUnit -tokens -diagnostics t.text 
[@0,0:4='class',<'class'>,1:0]
[@1,6:9='Test',<CLASS_NAME>,1:6]
[@2,11:11='{',<'{'>,1:11]
[@3,18:21='void',<CLASS_NAME>,3:4]
[@4,23:25='run',<CLASS_NAME>,3:9]
[@5,26:26='(',<'('>,3:12]
[@6,27:27=')',<')'>,3:13]
[@7,29:29='{',<'{'>,3:15]
[@8,31:31='}',<'}'>,3:17]
[@9,34:34='}',<'}'>,5:0]
[@10,36:35='<EOF>',<EOF>,6:0]
Question last update 0841
line 3:9 mismatched input 'run' expecting METHOD_NAME

because run has been tokenized as CLASS_NAME, so the parser never sees a METHOD_NAME token.

I would write the grammar like so:

grammar Question;

// Parser

compilationUnit
@init {System.out.println("Question last update 0919");}
    : classDeclaration;

classDeclaration : 'class' ID classBlock
  ;

classBlock: OPEN_BLOCK method* CLOSE_BLOCK
  ;

method: methodReturnValue=ID methodName=ID methodArgs methodBlock
        {System.out.println("Method found : " + $methodName.text + 
                            " which returns a " + $methodReturnValue.text);}
  ;

methodArgs: OPEN_PAREN CLOSE_PAREN
  ;

methodBlock: OPEN_BLOCK CLOSE_BLOCK
  ;

// Lexer

ID : ALPHA ( ALPHA | DIGIT | '_' )* ;

WS: [ \t\r\n]+ -> skip;

OPEN_BLOCK: '{';
CLOSE_BLOCK: '}';

OPEN_PAREN: '(';
CLOSE_PAREN: ')';

fragment ALPHA : [a-zA-Z] ;
fragment DIGIT : [0-9] ;

Execution:

$ grun Question compilationUnit -tokens -diagnostics t.text 
[@0,0:4='class',<'class'>,1:0]
[@1,6:9='Test',<ID>,1:6]
[@2,11:11='{',<'{'>,1:11]
[@3,18:21='void',<ID>,3:4]
[@4,23:25='run',<ID>,3:9]
[@5,26:26='(',<'('>,3:12]
[@6,27:27=')',<')'>,3:13]
[@7,29:29='{',<'{'>,3:15]
[@8,31:31='}',<'}'>,3:17]
[@9,34:34='}',<'}'>,5:0]
[@10,36:35='<EOF>',<EOF>,6:0]
Question last update 0919
Method found : run which returns a void

$ grun Question compilationUnit -gui t.text

methodReturnValue and methodName are available in the listener through ctx, the rule context.
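The rewritten grammar works because the lexer hands out a single ID token type and the method rule decides, by position, which ID is the return type and which is the name. The following plain-Java sketch mimics that idea without ANTLR. It is a simplification: it scans for the pattern ID ID ( ) { } instead of really parsing, it treats class as an ordinary identifier, and the names PositionalIds, tokenize and methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PositionalIds {
    static final String ID = "[a-zA-Z][a-zA-Z0-9_]*";

    // Splits the source into identifiers and the four punctuation tokens.
    // Whitespace (and, in this toy, any unknown character) is skipped.
    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        Matcher m = Pattern.compile(ID + "|[{}()]").matcher(src);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    // Looks for "ID ID ( ) { }" and labels the two identifiers by position:
    // the first is the return type, the second is the method name.
    static List<String> methods(List<String> tokens) {
        List<String> tail = Arrays.asList("(", ")", "{", "}");
        List<String> found = new ArrayList<>();
        for (int i = 0; i + 5 < tokens.size(); i++) {
            String returnType = tokens.get(i);
            String name = tokens.get(i + 1);
            if (returnType.matches(ID) && name.matches(ID)
                    && tokens.subList(i + 2, i + 6).equals(tail)) {
                found.add(name + " returns " + returnType);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String src = "class Test {\n\n    void run() { }\n\n}";
        System.out.println(methods(tokenize(src)));
    }
}
```

For the test file above this prints [run returns void], the same pairing that the action in the method rule reports.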