Manipulate the text of a lexical rule in antlr4

Question

I have this lexer rule for strings:

STRINGLIT: '"' ( ('\\'[\"bftrn]) | ~[\n\"] )* '"' ;

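For example, given the input "abc", I want the output abc,<EOF>, with the " characters discarded.

I read at http://www.antlr2.org/doc/lexer.html that you can use the ! operator for this. So I would have: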

STRINGLIT: '"'! ( ('\\'[\"bftrn]) | ~[\n\"] )* '"'! ;

However, I can't get it to work in my code.

Answer 1

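The ! operator is a v2 feature that has not been supported since v3 (and you are using v4).

There is no equivalent operator in v3 or v4. The only ways to strip the quotes are to do so after parsing, in a listener or visitor, or to embed target-specific code in the lexer: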

STRINGLIT
 : '"' ( ( '\\' [\\bftrn"] ) | ~[\\\r\n"] )* '"'
   {
     // Get all the text that this rule matched
     String matched = getText();

     // Strip the first and the last characters (the quotes)
     String matchedWithoutQuotes = matched.substring(1, matched.length() - 1);

     // Possibly do some more replacements here, like replacing `\\n` with `\n` etc.

     // Set the new string as the text of this token
     setText(matchedWithoutQuotes);
   }
 ;
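
The "more replacements" step mentioned in the comment above could look roughly like the following. This is only a sketch: the class name `StringLiterals` and the method `unquote` are made up for illustration and are not part of ANTLR, and the set of escapes handled is assumed to be exactly the one the STRINGLIT rule accepts (`\\`, `\"`, `\b`, `\f`, `\t`, `\r`, `\n`).

```java
// Hypothetical helper, not part of the ANTLR runtime.
public final class StringLiterals {
    private StringLiterals() {}

    /** Strips the surrounding quotes and decodes the escapes allowed by STRINGLIT. */
    public static String unquote(String literal) {
        int len = literal.length();
        if (len < 2 || literal.charAt(0) != '"' || literal.charAt(len - 1) != '"') {
            throw new IllegalArgumentException("not a quoted string literal: " + literal);
        }
        // Drop the first and last characters (the quotes).
        String body = literal.substring(1, len - 1);
        StringBuilder out = new StringBuilder(body.length());
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c != '\\') {
                out.append(c);
                continue;
            }
            if (i + 1 >= body.length()) {
                throw new IllegalArgumentException("dangling backslash in: " + literal);
            }
            // Consume the character after the backslash and map it to its meaning.
            char escaped = body.charAt(++i);
            switch (escaped) {
                case 'b':  out.append('\b'); break;
                case 'f':  out.append('\f'); break;
                case 't':  out.append('\t'); break;
                case 'r':  out.append('\r'); break;
                case 'n':  out.append('\n'); break;
                case '"':  out.append('"');  break;
                case '\\': out.append('\\'); break;
                default:
                    throw new IllegalArgumentException("unknown escape: \\" + escaped);
            }
        }
        return out.toString();
    }
}
```

Assuming the Java target and that this class is on the classpath, the lexer action could then be reduced to `setText(StringLiterals.unquote(getText()));`. The same helper could instead be called on the token's text from a listener or visitor, which keeps the grammar free of target-specific code.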
