Issue parsing a custom grammar using JavaCC
I am writing a parser with JavaCC. This is what I have so far:
PARSER_BEGIN(Compiler)
public class Compiler {
    public static void main(String[] args) {
        try {
            (new Compiler(new java.io.BufferedReader(new java.io.FileReader(args[0])))).S();
            System.out.println("Syntax is correct");
        } catch (Throwable e) {
            e.printStackTrace();
        }
    }
}
PARSER_END(Compiler)
<DEFAULT, INBODY> SKIP: { " " | "\t" | "\r" }
<DEFAULT> TOKEN: { "(" | ")" | <ID: (["a"-"z","A"-"Z","0"-"9","-","_"])+ > | "\n" : INBODY }
<DEFAULT> TOKEN: { <#RAND: (" " | "\t" | "\r")* > | <END: <RAND> "\n" <RAND> ("\n" <RAND>)+ > }
<INBODY> TOKEN: { <STRING: (~["\n", "\r"])*> : DEFAULT }
void S(): {}
{
    (Signature() "\n" Body() (["\n"] <EOF> | <END> [<EOF>]) )+
}
void Signature(): {}
{
    "(" <ID> <ID> ")"
}
void Body(): {}
{
    <STRING> ("\n" <STRING> )*
}
My goal is to parse a language that looks like this:
(test1 pic1)
This line is a <STRING> token
After the last <STRING> one empty line is necessary

(test2 pic1)
String1
It is also allowed to have an arbitrary number (>=1) of empty lines


(test3 pic1)
String1
String2

(test4 pic1)
String1
String2
An arbitrary number (also zero) of empty lines follow till <EOF>
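It almost works, but I am now facing the following problem:
At the end of the parsed text (as shown in the example above), an arbitrary number of empty lines (including zero) is allowed before <EOF>. If there is no empty line before <EOF>, it works as expected (it prints "Syntax is correct"). If there are at least two empty lines before <EOF>, it also works as expected (it prints "Syntax is correct"). If there is exactly one empty line before <EOF>, it should also print "Syntax is correct", but instead I get the following exception stack trace: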
ParseException: Encountered "<EOF>" at line 19, column 9.
Was expecting:
<STRING> ...
at Compiler.generateParseException(Compiler.java:284)
at Compiler.jj_consume_token(Compiler.java:217)
at Compiler.Body(Compiler.java:83)
at Compiler.S(Compiler.java:18)
at Compiler.main(Compiler.java:6)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at com.simontuffs.onejar.Boot.run(Boot.java:340)
at com.simontuffs.onejar.Boot.main(Boot.java:166)
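Does anyone know what the problem is?
Update:
Changing the line
(Signature() "\n" Body() (["\n"] <EOF> | <END> [<EOF>]) )+
to
(Signature() "\n" Body() (<EOF> | <END> [<EOF>]) )+
produces the same behavior. The ["\n"] seems (for some reason) to be ignored completely.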
I found the core of the problem. Changing the line
<STRING> ("\n" <STRING> )*
to
<STRING> (LOOKAHEAD(2) "\n" <STRING> )*
solves the problem.
All it needed was a local LOOKAHEAD(2).
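A likely explanation for why the local lookahead helps, assuming JavaCC's default lookahead of one token: when the input ends with a single line break, the parser sees "\n" right after the last <STRING>, enters the loop in Body(), consumes the "\n", and then finds <EOF> where it requires <STRING>. That matches the stack trace (the exception is thrown from Body(), expecting <STRING>), and it would also explain why the ["\n"] in S() looked as if it were ignored: the parser never gets that far. The Body() production with the change applied, as a sketch with comments:

```javacc
void Body(): {}
{
    <STRING>
    // With the default one-token lookahead, seeing "\n" alone is enough
    // to enter the loop, even if no <STRING> follows it.
    // LOOKAHEAD(2) enters the loop only when "\n" is followed by <STRING>;
    // otherwise the "\n" is left for the ["\n"] <EOF> alternative in S().
    ( LOOKAHEAD(2) "\n" <STRING> )*
}
```

The lookahead is local to this one choice point, so the rest of the grammar still uses a single token of lookahead.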