Antlr semantic predicate failed to find viable alternative

I can't get even a simple semantic predicate to work with Antlr 4.6.6 for .NET Framework 4.8. The grammar below fails to find a viable alternative for the input "received:last week".

grammar test;

// Parser rules
parse
: expr (expr)* EOF
;

expr 
: {false}? received ':' lastWeek
| received ':' text
| text
;   

received: RECEIVED;
lastWeek: LASTWEEK;
text: TEXT;

RECEIVED: 'received';

TEXT
: 
~(' ' | ':')+
;

LASTWEEK: 'last week';

SPACES: [ \t\r\n] -> skip;

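Update: this is a simplification of my problem. Is it possible to have a grammar that parses "received:last week" as "received" "last week" only when "last week" is preceded by "received", while, for example, "subject:last week" is parsed as "subject" "last" "week"?

When I run this code: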

public static void main(String[] args) {
    String source = "received:last week";
    testLexer lexer = new testLexer(CharStreams.fromString(source));
    testParser parser = new testParser(new CommonTokenStream(lexer));
    System.out.println(parser.parse().toStringTree(parser));
}

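the error line 1:0 no viable alternative at input 'received' is printed to STDERR. When I change {false}? to {true}?, the input is parsed correctly (as expected).

If you expected the input to be parsed as received ':' text because of the {false}? predicate, then you misunderstand how ANTLR's lexer works. The lexer produces tokens independently of the parser. It does not matter that the parser is trying to match a TEXT token: your input is always tokenized the same way.

The lexer works like this:

  1. Try to consume as many characters as possible.
  2. If two or more lexer rules match the same characters, let the rule defined first "win".

Given these rules, it is clear that "received:last week" is tokenized as the tokens RECEIVED, ':' and LASTWEEK.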
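The two lexer rules above can be mimicked with plain regular expressions. The following is only a rough sketch of that behavior, not ANTLR's actual implementation: the class name LongestMatchSketch and the regex translations of the question's lexer rules are my own stand-ins, and I am assuming the implicit ':' literal is ordered before the named rules (the order does not change the result for this input).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the lexer generated from the question's grammar.
class LongestMatchSketch {
    // Rules in definition order; insertion order is what breaks ties.
    private static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("':'", Pattern.compile(":"));              // implicit literal token (assumed first)
        RULES.put("RECEIVED", Pattern.compile("received"));
        RULES.put("TEXT", Pattern.compile("[^ :]+"));        // ~(' ' | ':')+
        RULES.put("LASTWEEK", Pattern.compile("last week"));
        RULES.put("SPACES", Pattern.compile("[ \\t\\r\\n]")); // skipped below
    }

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestRule = null;
            int bestLength = 0;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // Rule 1: prefer the longest match.
                // Rule 2: on a tie, the strict '>' keeps the earlier rule.
                if (m.lookingAt() && m.end() - pos > bestLength) {
                    bestRule = rule.getKey();
                    bestLength = m.end() - pos;
                }
            }
            if (bestRule == null) {
                throw new IllegalArgumentException("no rule matches at index " + pos);
            }
            if (!bestRule.equals("SPACES")) {
                tokens.add(bestRule);
            }
            pos += bestLength;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The parser is never consulted: the token stream is fixed by the input alone.
        System.out.println(tokenize("received:last week")); // prints [RECEIVED, ':', LASTWEEK]
    }
}
```

Note that "received" matches both RECEIVED and TEXT with the same length, so the earlier rule wins, while "last week" goes to LASTWEEK because it is a longer match than the TEXT match "last". No predicate in the parser can change either decision.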

EDIT

Is it possible to have a grammar that can parse this "received:last week" as "received" "last week" only if the "last week" is preceded by "received" but if for example I have "subject:last week" to be parsed as "subject" "last" "week"

You can use lexical modes to make the lexer context-sensitive. You then have to create separate lexer and parser grammars, which could look like this:

TestLexer.g4

lexer grammar TestLexer;

RECEIVED : 'received' -> pushMode(RECEIVED_MODE);
SUBJECT  : 'subject';
TEXT     : ~[ :]+;
COLON    : ':';
SPACES   : SPACE+     -> skip;

fragment SPACE : [ \t\r\n];

mode RECEIVED_MODE;
  LASTWEEK            : 'last' SPACE+ 'week' -> popMode;
  RECEIVED_MODE_COLON : ':'                  -> type(COLON);
  RECEIVED_MODE_TEXT  : ~[ :]+               -> type(TEXT), popMode;

You can use the lexer above in a parser grammar like this:

TestParser.g4

parser grammar TestParser;

options {
  tokenVocab=TestLexer;
}

...

Now "received:last week" will be tokenized as:

'received'                `received`
COLON                     `:`
LASTWEEK                  `last week`
EOF                       `<EOF>`

"subject:last week" will be tokenized as:

'subject'                 `subject`
COLON                     `:`
TEXT                      `last`
TEXT                      `week`
EOF                       `<EOF>`
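To make the effect of pushMode and popMode more concrete, here is a small dependency-free simulation of the mode stack. It is only a sketch under my own assumptions: the class LexerModeSketch, the Rule helper and the regexes are hypothetical stand-ins for the rules in TestLexer.g4, and unlike a real ANTLR lexer it throws on unmatched input instead of reporting an error and continuing.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical simulation of TestLexer.g4; not generated by ANTLR.
class LexerModeSketch {
    enum Action { NONE, PUSH_RECEIVED_MODE, POP_MODE }

    static final class Rule {
        final String mode;
        final String type;   // emitted token type, or "SKIP"
        final Pattern pattern;
        final Action action;

        Rule(String mode, String type, String regex, Action action) {
            this.mode = mode;
            this.type = type;
            this.pattern = Pattern.compile(regex);
            this.action = action;
        }
    }

    // Same order as in the lexer grammar; earlier rules win ties.
    static final List<Rule> RULES = List.of(
        new Rule("DEFAULT", "RECEIVED", "received", Action.PUSH_RECEIVED_MODE),
        new Rule("DEFAULT", "SUBJECT", "subject", Action.NONE),
        new Rule("DEFAULT", "TEXT", "[^ :]+", Action.NONE),
        new Rule("DEFAULT", "COLON", ":", Action.NONE),
        new Rule("DEFAULT", "SKIP", "[ \\t\\r\\n]+", Action.NONE),
        new Rule("RECEIVED_MODE", "LASTWEEK", "last[ \\t\\r\\n]+week", Action.POP_MODE),
        new Rule("RECEIVED_MODE", "COLON", ":", Action.NONE),
        new Rule("RECEIVED_MODE", "TEXT", "[^ :]+", Action.POP_MODE));

    static List<String> tokenize(String input) {
        Deque<String> modes = new ArrayDeque<>();
        modes.push("DEFAULT");
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            Rule best = null;
            int bestEnd = pos;
            for (Rule rule : RULES) {
                // Only rules of the mode on top of the stack are candidates.
                if (!rule.mode.equals(modes.peek())) continue;
                Matcher m = rule.pattern.matcher(input).region(pos, input.length());
                if (m.lookingAt() && m.end() > bestEnd) {
                    best = rule;
                    bestEnd = m.end();
                }
            }
            if (best == null) {
                throw new IllegalArgumentException("no rule matches at index " + pos);
            }
            if (!best.type.equals("SKIP")) tokens.add(best.type);
            if (best.action == Action.PUSH_RECEIVED_MODE) modes.push("RECEIVED_MODE");
            if (best.action == Action.POP_MODE) modes.pop();
            pos = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("received:last week")); // prints [RECEIVED, COLON, LASTWEEK]
        System.out.println(tokenize("subject:last week"));  // prints [SUBJECT, COLON, TEXT, TEXT]
    }
}
```

The LASTWEEK rule is only reachable while RECEIVED_MODE is on top of the stack, which is why "last week" after "subject" falls back to two TEXT tokens.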

EDIT II

You could also move the creation of last week into the parser, like this:

received
 : RECEIVED ':' last_week
 ;

subject
 : SUBJECT ':' text
 ;

last_week
 : LAST WEEK
 ;

text
 : TEXT
 | LAST
 | WEEK
 ;

RECEIVED : 'received';
SUBJECT  : 'subject';
LAST     : 'last';
WEEK     : 'week';
TEXT     : ~[ :]+;