Custom error handler methods fail to handle token recognition errors
Here is my .g4 file:
grammar Hello;
start : compilation;
compilation : sql*;
sql : altercommand;
altercommand : ALTER TABLE SEMICOLON;
ALTER: 'alter';
TABLE: 'table';
SEMICOLON : ';';
My main class:
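import java.io.IOException;
import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CommonTokenStream;

public class Main {
    public static void main(String[] args) throws IOException {
        // ANTLRInputStream is deprecated since ANTLR 4.7 (CharStreams.fromString replaces it).
        ANTLRInputStream ip = new ANTLRInputStream("altasdere table ; alter table ;");
        HelloLexer lex = new HelloLexer(ip);
        CommonTokenStream token = new CommonTokenStream(lex);
        HelloParser parser = new HelloParser(token);
        // Install the custom strategy on the parser, then print the parse tree.
        parser.setErrorHandler(new CustomeErrorHandler());
        System.out.println(parser.start().toStringTree(parser));
    }
}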
My CustomeErrorHandler class:
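import org.antlr.v4.runtime.DefaultErrorStrategy;
import org.antlr.v4.runtime.Parser;
import org.antlr.v4.runtime.RecognitionException;
import org.antlr.v4.runtime.TokenStream;
import org.antlr.v4.runtime.misc.IntervalSet;

public class CustomeErrorHandler extends DefaultErrorStrategy {
    @Override
    public void recover(Parser recognizer, RecognitionException e) {
        super.recover(recognizer, e);
        TokenStream tokenStream = (TokenStream) recognizer.getInputStream();
        // If recovery stopped at a semicolon, skip it and resynchronize.
        if (tokenStream.LA(1) == HelloParser.SEMICOLON) {
            IntervalSet intervalSet = getErrorRecoverySet(recognizer);
            tokenStream.consume();
            consumeUntil(recognizer, intervalSet);
        }
    }
}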
When I give it the input altasdere table ; alter table ; it does not parse the second command, because it found an error in the first one. The output of my main class is:
line 1:0 token recognition error at: 'alta'
line 1:4 token recognition error at: 's'
line 1:5 token recognition error at: 'd'
line 1:6 token recognition error at: 'e'
line 1:7 token recognition error at: 'r'
line 1:8 token recognition error at: 'e'
line 1:9 token recognition error at: ' '
(start compilation)
In section 9.5, "Altering ANTLR's Error Handling Strategy", of The Definitive ANTLR 4 Reference, I can read:
The default error handling mechanism works very well, but there are a
few atypical situations in which we might want to alter it.
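Is your grammar so atypical that you need to handle token recognition errors? Personally, I would write a grammar that produces no errors at the lexer level, as follows.
File Question.g4: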
grammar Question;
question
@init {System.out.println("Question last update 0712");}
: sql+ EOF
;
sql
: alter_command
| erroneous_command
;
alter_command
: ALTER TABLE SEMICOLON
{System.out.println("Alter command found : " + $text);}
;
erroneous_command
: WORD TABLE? SEMICOLON
{System.out.println("Erroneous command found : " + $text);}
;
ALTER : 'alter' ;
TABLE : 'table' ;
WORD : [a-z]+ ;
SEMICOLON : ';' ;
WS : [ \t\r\n]+ -> channel(HIDDEN) ;
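Note that the WORD rule must come after ALTER (and TABLE): when several lexer rules match the same number of characters, ANTLR picks the rule defined first in the grammar, so a WORD rule placed first would turn every keyword into a WORD.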
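The effect of rule order can be illustrated outside ANTLR. The following is a minimal sketch in plain Java that only mimics the lexer's choice (longest match wins, and on a tie the rule listed first wins) using java.util.regex. The class RuleOrderDemo and its methods tokenize, keywordsFirst and wordFirst are hypothetical names made up for this illustration; none of it is ANTLR runtime code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RuleOrderDemo {

    // Tokenize input with the given rules (name -> regex, in definition order).
    // Longest match wins; on a tie, the rule defined first wins.
    public static List<String> tokenize(Map<String, String> rules, String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestLen = 0;
            for (Map.Entry<String, String> rule : rules.entrySet()) {
                Matcher m = Pattern.compile(rule.getValue()).matcher(input);
                m.region(pos, input.length());
                // Strictly greater: an earlier rule keeps the token on a tie.
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    bestLen = m.end() - pos;
                    bestName = rule.getKey();
                }
            }
            if (bestName == null) {
                throw new IllegalStateException("token recognition error at index " + pos);
            }
            if (!bestName.equals("WS")) { // whitespace is dropped in this sketch
                tokens.add(bestName + "(" + input.substring(pos, pos + bestLen) + ")");
            }
            pos += bestLen;
        }
        return tokens;
    }

    // Same order as Question.g4: keywords before WORD.
    public static Map<String, String> keywordsFirst() {
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("ALTER", "alter");
        rules.put("TABLE", "table");
        rules.put("WORD", "[a-z]+");
        rules.put("SEMICOLON", ";");
        rules.put("WS", "[ \\t\\r\\n]+");
        return rules;
    }

    // Wrong order: WORD is defined before the keywords.
    public static Map<String, String> wordFirst() {
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("WORD", "[a-z]+");
        rules.put("ALTER", "alter");
        rules.put("TABLE", "table");
        rules.put("SEMICOLON", ";");
        rules.put("WS", "[ \\t\\r\\n]+");
        return rules;
    }

    public static void main(String[] args) {
        String input = "altasdere table ; alter table ;";
        System.out.println(tokenize(keywordsFirst(), input));
        System.out.println(tokenize(wordFirst(), input));
    }
}
```

With the keywords first, alter and table come out as ALTER and TABLE, while altasdere becomes a WORD because it is the longest match. With WORD first, every identifier-like input, keywords included, comes out as WORD.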
File t.text:
altasdere table ; alter table ;
Execution:
$ grun Question question -tokens -diagnostics t.text
[@0,0:8='altasdere',<WORD>,1:0]
[@1,9:9=' ',<WS>,channel=1,1:9]
[@2,10:14='table',<'table'>,1:10]
[@3,15:15=' ',<WS>,channel=1,1:15]
[@4,16:16=';',<';'>,1:16]
[@5,17:17=' ',<WS>,channel=1,1:17]
[@6,18:22='alter',<'alter'>,1:18]
[@7,23:23=' ',<WS>,channel=1,1:23]
[@8,24:28='table',<'table'>,1:24]
[@9,29:29=' ',<WS>,channel=1,1:29]
[@10,30:30=';',<';'>,1:30]
[@11,31:31='\n',<WS>,channel=1,1:31]
[@12,32:31='<EOF>',<EOF>,2:0]
Question last update 0712
Erroneous command found : altasdere table ;
Alter command found : alter table ;
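As you can see, the erroneous input has been absorbed by the WORD token. It should now be easy to process or ignore the erroneous command in a listener/visitor.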