My grammar identifies keywords as identifiers

Question

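Almost every word is recognized as an Identifier, and the more complex rules are never reached. For example, 'program' is recognized as a Condition, and 'integer a,b;' is not recognized as a Decl_list; only the 'integer' part is recognized, as a Decl.

Does anyone know why?

I am testing with this input: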

program test1
declare
 integer a, b, c;
 integer result;
begin
 read (a);
 read (c);
 b := 10;
 result := (a * c)/(b + 5) ;
 write(result);
end

lexer grammar MiniLexer;

Program: 'program' Identifier Body;

Body: ('declare' Decl_list) 'begin' Stmt_list 'end';

Decl_list: Decl ';' (Decl ';')?;

Decl: Type Ident_list;
fragment
Ident_list: (Identifier ','?)*;

Type: 'integer' | 'decimal';

Stmt_list: Stmt ';' ((Stmt ';')*)?;

Stmt: Assign_stmt | If_stmt | While_stmt| Read_stmt | Write_stmt;

Assign_stmt: Identifier ':=' Simple_expr;

If_stmt: 'if' Condition 'then' Stmt_list 'end' | 'if' Condition 'then' Stmt_list 'else' Stmt_list 'end';

Condition: Expression;

For_stmt: 'for' Assign_stmt 'to' Condition 'do' Stmt_list 'end';

While_stmt: 'while' Condition 'do' Stmt_list 'end';

Read_stmt: 'read' '(' Identifier ')';

Write_stmt: 'write' '(' Writable ')';

Writable: Simple_expr | Literal;

Expression: Simple_expr | Simple_expr Relop Simple_expr;

Simple_expr: Term | Term Addop Term| '(' Term ')' ? Term ':' Term;

Term: Factor_a | Factor_a Mulop Factor_a;

Factor_a: Factor | 'not' Factor | '-' Factor;

Factor: Identifier | Constant | '(' Expression ')';

Relop: '=' | '>' | '>=' | '<' | '<=' | '<>';

Addop: '+' | '-' | 'or';

Mulop: '*' | '/' | 'mod' | 'and';

Shiftop: '<<' | '>>' | '<<<' | '>>>';

COMENTARIO: '%' ~('\n'|'\r')* '\r'? '\n' {skip();};

WS :   ( ' '| '\t'| '\r'| '\n') {skip();};

Constant: ('0'..'9') (('0'..'9'))*;

Literal: '"' ('\u0000'..'\uFFFE')* '"';

Identifier: ('a'..'z'|'A'..'Z') (('a'..'z'|'A'..'Z') | ('0'..'9'))*;


Answer 1

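Your grammar is a lexer grammar, which means it only produces tokens. Read about the difference between lexer, parser, and combined grammars here: https://github.com/antlr/antlr4/blob/master/doc/grammars.md

In short, remove the word lexer from the grammar and turn some of the rules into parser rules (parser rules start with a lowercase letter):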

grammar Mini;

program: 'program' Identifier body EOF;

body: ('declare' decl_list) 'begin' stmt_list 'end';

decl_list: decl ';' (decl ';')?;

decl: type ident_list;

ident_list: (Identifier ','?)*;

type: 'integer' | 'decimal';

stmt_list: stmt ';' (stmt ';')*;

stmt: assign_stmt | if_stmt | while_stmt| read_stmt | write_stmt | for_stmt;

assign_stmt: Identifier ':=' simple_expr;

if_stmt: 'if' condition 'then' stmt_list 'end' | 'if' condition 'then' stmt_list 'else' stmt_list 'end';

condition: expression;

for_stmt: 'for' assign_stmt 'to' condition 'do' stmt_list 'end';

while_stmt: 'while' condition 'do' stmt_list 'end';

read_stmt: 'read' '(' Identifier ')';

write_stmt: 'write' '(' writable ')';

writable: simple_expr | Literal;

expression: simple_expr | simple_expr Relop simple_expr;

simple_expr: term | term Addop term| '(' term ')' ? term ':' term;

term: factor_a | factor_a Mulop factor_a;

factor_a: factor | 'not' factor | '-' factor;

factor: Identifier | Constant | '(' expression ')';

Relop: '=' | '>' | '>=' | '<' | '<=' | '<>';

Addop: '+' | '-' | 'or';

Mulop: '*' | '/' | 'mod' | 'and';

Shiftop: '<<' | '>>' | '<<<' | '>>>';

COMENTARIO: '%' ~('\n'|'\r')* '\r'? '\n' -> skip;

Constant: ('0'..'9') (('0'..'9'))*;

Literal: '"' ('\u0000'..'\uFFFE')* '"';

Identifier: ('a'..'z'|'A'..'Z') (('a'..'z'|'A'..'Z') | ('0'..'9'))*;

Space: [ \t\r\n] -> skip;

Note that {skip();} is the old v3 syntax; use -> skip instead.

Also, Constant: ('0'..'9') (('0'..'9'))*; is old v3 syntax (although it is still valid in v4). The preferred way to write it is:

Constant: [0-9] (([0-9]))*;

which can be written more simply as:

Constant: [0-9]+;
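
To make the lexer-versus-parser point more concrete, here is a rough Java sketch of how a lexer picks a rule. It assumes the ANTLR 4 behaviour that the longest match wins and that ties go to the rule listed first. It is not ANTLR itself: the class LexerTieBreak, its method names, and the regular expressions standing in for the rules are all made up for illustration, and only a few rules from the two grammars are modelled.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Simplified model of lexer rule selection (an assumption-based sketch, not ANTLR):
// the longest match wins, and ties go to the rule listed first.
class LexerTieBreak {

    // Returns the name of the rule that wins at the start of the input.
    static String winningRule(Map<String, String> rules, String input) {
        String best = null;
        int bestLen = 0;
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            Matcher m = Pattern.compile(rule.getValue()).matcher(input);
            // lookingAt() anchors at the start but need not consume the whole input.
            // The strict '>' keeps the earlier rule when two matches have equal length.
            if (m.lookingAt() && m.end() > bestLen) {
                best = rule.getKey();
                bestLen = m.end();
            }
        }
        return best;
    }

    // Lexer-only grammar from the question, reduced to three rules in file order.
    static Map<String, String> lexerOnlyRules() {
        String id = "[a-zA-Z][a-zA-Z0-9]*";
        Map<String, String> rules = new LinkedHashMap<>();
        // Decl: Type Ident_list, where Ident_list may match nothing at all.
        rules.put("Decl", "(integer|decimal)(" + id + ",?)*");
        // Condition -> Expression -> Simple_expr -> ... -> Identifier.
        rules.put("Condition", id);
        rules.put("Identifier", id);
        return rules;
    }

    // Combined grammar: literals used in parser rules act as tokens
    // that take priority over Identifier.
    static Map<String, String> combinedRules() {
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("'program'", "program");
        rules.put("'integer'", "integer");
        rules.put("Identifier", "[a-zA-Z][a-zA-Z0-9]*");
        return rules;
    }

    public static void main(String[] args) {
        System.out.println(winningRule(lexerOnlyRules(), "program test1")); // Condition
        System.out.println(winningRule(lexerOnlyRules(), "integer a, b"));  // Decl
        System.out.println(winningRule(combinedRules(), "program test1"));  // 'program'
        System.out.println(winningRule(combinedRules(), "programs"));       // Identifier
    }
}
```

In the lexer-only model, whitespace is skipped between tokens rather than inside them, so a rule such as Program can never span 'program test1'. The first rule that can match the single word wins instead, which is consistent with the Condition and Decl results reported in the question.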


java

antlr4