Grammar to negate two like characters in a lexer rule inside a single quoted string

Question

ANTLR 4:

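I need to support single-quoted string literals with escape characters, plus the ability to use double curly braces as an 'escape sequence' that needs additional parsing. So both of these examples need to be supported. I'm less worried about the second example, because it seems trivial if I can get the first to work without matching the double-curly characters.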

1. 'this is a string literal with an escaped\' character'
2. 'this is a string {{functionName(x)}} literal with double curlies'

StringLiteral 
: '\'' (ESC | AnyExceptDblCurlies)*? '\'' ;

fragment 
ESC : '\\' [btnr\'\\];

fragment 
AnyExceptDblCurlies 
: '{' ~'{' 
| ~'{' .;

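I've done a lot of research on this and understand that you can't negate multiple characters, and I've even seen a similar approach work in Bart's answer in this post...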

Negating inside lexer- and parser rules

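But what I'm seeing is that in example 1 above, the escaped single quote is not recognized, and I get a parser error saying it cannot match 'character'.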

If I change the string literal token rule to the following, it works...

StringLiteral 
: '\'' (ESC | .)*? '\'' ;

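Any ideas on how to handle this scenario better? I can deduce that the escaped character is being matched by AnyExceptDblCurlies instead of ESC, but I'm not sure how to solve this.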

Answer 1

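To parse a template definition out of a string, you pretty much need to handle it in the parser. Use lexer modes to distinguish between string characters and template names.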

Parser:

options {
    tokenVocab = TesterLexer ;
}

test : string EOF ;
string   : STRBEG ( SCHAR | template )* STREND ; // allow multiple templates per string
template : TMPLBEG TMPLNAME TMPLEND ;

Lexer:

STRBEG : Squote -> pushMode(strMode) ;

mode strMode ;
    STRESQ  : Esqote  -> type(SCHAR) ; // predeclare SCHAR in tokens block
    STREND  : Squote  -> popMode ;
    TMPLBEG : DBrOpen -> pushMode(tmplMode) ;
    STRCHAR : .       -> type(SCHAR) ;

mode tmplMode ;
    TMPLEND  : DBrClose  -> popMode ;
    TMPLNAME : ~'}'+  ; // '+' so the rule can never match the empty string

fragment Squote : '\''   ;
fragment Esqote : '\\\'' ;
fragment DBrOpen   : '{{' ;
fragment DBrClose  : '}}' ;
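
The `-> type(SCHAR)` commands in the lexer refer to a token type that no rule defines, hence the "predeclare" comment. A minimal sketch of what the top of the lexer file might look like, assuming the grammar is named TesterLexer to match the `tokenVocab` option in the parser:

```antlr
lexer grammar TesterLexer ;  // name assumed from tokenVocab = TesterLexer

// Declare SCHAR so that -> type(SCHAR) has a token type to refer to.
tokens { SCHAR }
```

Modes are only available in a lexer grammar, not in a combined grammar, which is why the parser and lexer are kept in separate files here.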

Updated to correct the TMPLNAME rule and to add the main rule and options block.
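
For the narrower goal stated in the question, a single StringLiteral token that honors escapes and refuses to match `{{`, one option is to consume one character per alternative rather than two. This is an untested sketch and not part of the answer above. It reuses the question's ESC fragment, and the layout of the alternatives is my own assumption:

```antlr
// Sketch only: one character per alternative, so an escape can never be
// swallowed as the second half of a two-character match.
StringLiteral
    : '\''
      ( ESC                        // \b \t \n \r \' \\
      | ~[{'\\]                    // any ordinary character
      | '{' ( ESC | ~[{'\\] )      // a lone '{' followed by something other than '{'
      )*
      '{'?                         // a lone '{' just before the closing quote
      '\''
    ;

fragment ESC : '\\' [btnr'\\] ;
```

With this rule a string containing `{{` is not a StringLiteral at all, so the mode-based lexer is still the way to go if templates are to be parsed rather than rejected.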


antlr

parser-generator

antlr4