ANTLR String interpolation
I am trying to write an ANTLR grammar that parses string interpolation expressions such as:
my.greeting = "hello ${your.name}"
The errors I get are:
line 1:31 token recognition error at: 'e'
line 1:34 no viable alternative at input '<EOF>'
MyParser.g4:
parser grammar MyParser;
options { tokenVocab=MyLexer; }
program: variable EQ expression EOF;
expression: (string | variable);
variable: (VAR DOT)? VAR;
string: (STRING_SEGMENT_END expression)* STRING_END;
MyLexer.g4:
lexer grammar MyLexer;
START_STR: '"' -> more, pushMode(STRING_MODE) ;
VAR: (UPPERCASE|LOWERCASE) ANY_CHAR*;
EQ: '=';
DOT: '.';
WHITE_SPACE: (SPACE | NEW_LINE | TAB)+ -> skip;
fragment DIGIT: '0'..'9';
fragment LOWERCASE: 'a'..'z';
fragment UPPERCASE: 'A'..'Z';
fragment ANY_CHAR: LOWERCASE | UPPERCASE | DIGIT;
fragment NEW_LINE: '\n' | '\r' | '\r\n';
fragment SPACE: ' ';
fragment TAB: '\t';
mode INTERPOLATION_MODE;
STRING_SEGMENT_START: '}' -> more, popMode;
mode STRING_MODE;
STRING_END: '"' -> popMode;
STRING_SEGMENT_END: '${' -> pushMode(INTERPOLATION_MODE);
TEXT : ~["$]+ -> more ;
Expressions like the following work fine:
my.greeting = "hello"
my.greeting = "hello ${} world"
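Any idea what I am doing wrong?
OK, I have solved it (inspired by this). Lexer rules only apply in the mode where they are defined, so I need to define the default lexer rules again in INTERPOLATION_MODE: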
MyLexer.g4:
...
mode INTERPOLATION_MODE;
STRING_SEGMENT_START: '}' -> more, popMode;
I_VAR: (UPPERCASE|LOWERCASE) ANY_CHAR*;
I_DOT: '.';
...
MyParser.g4:
...
variable: ((VAR|I_VAR) (DOT|I_DOT))? (VAR|I_VAR);
...
This seems a bit excessive, though, so I am still waiting for someone with a better answer.
Instead of:
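To illustrate why the original lexer chokes inside `${...}`, here is a toy Python simulation of a mode stack. It is only a sketch, not ANTLR itself: `make_rules`, `tokenize`, and the rule-table layout are hypothetical names, it takes the first matching rule instead of ANTLR's longest match, it emits a separate token where the grammar uses `more`, and its error reporting is simpler than ANTLR's.

```python
import re

IDENT = r"[A-Za-z][A-Za-z0-9]*"

def make_rules(with_interpolation_rules):
    """Build mode -> [(token_type, regex, command)] tables (hypothetical layout)."""
    rules = {
        "DEFAULT": [
            ("START_STR", r'"', "push:STRING"),
            ("VAR", IDENT, None),
            ("EQ", r"=", None),
            ("DOT", r"\.", None),
            ("WHITE_SPACE", r"[ \t\r\n]+", "skip"),
        ],
        "STRING": [
            ("STRING_END", r'"', "pop"),
            ("STRING_SEGMENT_END", r"\$\{", "push:INTERPOLATION"),
            ("TEXT", r'[^"$]+', None),
        ],
        "INTERPOLATION": [
            ("STRING_SEGMENT_START", r"\}", "pop"),
        ],
    }
    if with_interpolation_rules:
        # Analogue of redefining VAR and DOT inside INTERPOLATION_MODE.
        rules["INTERPOLATION"] += [("I_VAR", IDENT, None), ("I_DOT", r"\.", None)]
    return rules

def tokenize(text, rules):
    """Return (tokens, errors); only rules of the mode on top of the stack are tried."""
    modes, pos, tokens, errors = ["DEFAULT"], 0, [], []
    while pos < len(text):
        for token_type, pattern, command in rules[modes[-1]]:
            match = re.compile(pattern).match(text, pos)
            if match:
                break
        else:
            errors.append((pos, text[pos]))  # no rule in this mode matches
            pos += 1
            continue
        if command != "skip":
            tokens.append((token_type, match.group()))
        if command == "pop":
            modes.pop()
        elif command and command.startswith("push:"):
            modes.append(command[len("push:"):])
        pos = match.end()
    return tokens, errors

source = 'my.greeting = "hello ${your.name}"'
print(tokenize(source, make_rules(False))[1])  # unmatched characters inside ${...}
print(tokenize(source, make_rules(True))[1])   # no unmatched characters
```

In this sketch `"hello ${} world"` tokenizes cleanly even without the extra rules, because nothing between `${` and `}` needs a rule from the default mode.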
mode INTERPOLATION_MODE;
STRING_SEGMENT_START: '}' -> more, popMode;
I_VAR: (UPPERCASE|LOWERCASE) ANY_CHAR*;
I_DOT: '.';
...
variable: ((VAR|I_VAR) (DOT|I_DOT))? (VAR|I_VAR);
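you can try the type lexer command, which makes I_VAR and I_DOT emit the existing VAR and DOT token types so that the parser rule stays unchanged: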
mode INTERPOLATION_MODE;
STRING_SEGMENT_START: '}' -> more, popMode;
I_VAR: (UPPERCASE|LOWERCASE) ANY_CHAR* -> type(VAR);
I_DOT: '.' -> type(DOT);
...
variable: (VAR DOT)? VAR;
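The effect of remapping the token type can be sketched in a few lines of Python. This is a toy model under stated assumptions, not ANTLR's implementation: `RULES`, `lex`, and `is_variable` are hypothetical names, and each rule simply carries an emitted type that may differ from the rule's own name.

```python
import re

IDENT = r"[A-Za-z][A-Za-z0-9]*"

# mode -> list of (rule_name, regex, emitted_type); hypothetical layout.
# I_VAR and I_DOT emit VAR and DOT, mimicking `-> type(VAR)` and `-> type(DOT)`.
RULES = {
    "DEFAULT": [("VAR", IDENT, "VAR"), ("DOT", r"\.", "DOT")],
    "INTERPOLATION": [("I_VAR", IDENT, "VAR"), ("I_DOT", r"\.", "DOT")],
}

def lex(text, mode):
    """Return the emitted token types for text, using only the rules of one mode."""
    pos, types = 0, []
    while pos < len(text):
        for _rule_name, pattern, emitted_type in RULES[mode]:
            match = re.compile(pattern).match(text, pos)
            if match:
                types.append(emitted_type)
                pos = match.end()
                break
        else:
            raise ValueError(f"no rule in mode {mode} matches {text[pos]!r} at {pos}")
    return types

def is_variable(types):
    """Mirror of the parser rule `variable: (VAR DOT)? VAR;`."""
    return types in (["VAR"], ["VAR", "DOT", "VAR"])

print(lex("my.greeting", "DEFAULT"))
print(lex("your.name", "INTERPOLATION"))
```

Because both modes hand the same token types to the consumer, a single `is_variable` check covers input from either mode, which is why the parser rule no longer needs the `(VAR|I_VAR)` alternatives.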