Is it possible to describe block comments using EBNF?

Say I have the following EBNF:

document    = content , { content } ;
content     = hello world | answer | space ;
hello world = "hello" , space , "world" ;
answer      = "42" ;
space       = " " ;

This lets me parse input such as:

hello world 42

Now I want to extend this grammar with a block comment. How can I do this properly?

If I start simple:

document    = content , { content } ;
content     = hello world | answer | space | comment;
hello world = "hello" , space , "world" ;
answer      = "42" ;
space       = " " ;
comment     = "/*" , ?any character? , "*/" ;

I cannot parse:

Hello /* I'm the taxman! */ World 42

If I extend the grammar further to cover the special case above, it gets ugly, but it parses.

document    = content , { content } ;
content     = hello world | answer | space | comment;
hello world = "hello" , { comment } , space , { comment } , "world" ;
answer      = "42" ;
space       = " " ;
comment     = "/*" , ?any character? , "*/" ;

But I still cannot parse something like:

Hel/*p! I need somebody. Help! Not just anybody... */lo World 42

How would I do this with an EBNF grammar? Or is it not even possible at all?

Assuming you regard "hello" as a token, you would not want anything to break it up. If you do need that, you have to explode the rule:

hello_world = "h", {comment}, "e", {comment}, "l", {comment}, "l", {comment}, "o" ,
              { comment }, space, { comment },
              "w", {comment}, "o", {comment}, "r", {comment}, "l", {comment}, "d" ;
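To show how mechanical (and how verbose) this explosion is, here is a minimal Python sketch that generates such a rule for any word. The helper name `explode` and the exact spacing of the output are illustrative assumptions, not part of the grammar above.

```python
def explode(word: str, filler: str = "{ comment }") -> str:
    """Return an EBNF sequence that allows `filler` between the letters of `word`.

    `explode` is a hypothetical helper used only for illustration.
    """
    return f" , {filler} , ".join(f'"{ch}"' for ch in word)


# Rebuild the exploded hello_world rule from its two words.
rule = (
    "hello_world = "
    + explode("hello")
    + " , { comment } , space , { comment } , "
    + explode("world")
    + " ;"
)

if __name__ == "__main__":
    print(rule)
```

Every keyword in the language would need the same treatment, which is why this approach is rarely used in practice.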

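On the wider question: it seems quite common not to describe a language's comments as part of the formal grammar, but to treat them as a side note instead. However, it can generally be done by treating a comment as equivalent to whitespace: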

space = " " | comment ;

You might also want to add a rule that describes consecutive whitespace:

spaces = { space }- ;

Cleaning up your final grammar, but treating "hello" and "world" as tokens (i.e. not allowing them to be split), could give something like this:

document    = { content }- ;
content     = hello world | answer | space ;
hello world = "hello" , spaces , "world" ;
answer      = "42" ;
spaces      = { space }- ;
space       = " " | comment ;
comment     = "/*" , ?any character? , "*/" ;
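As a rough check of this cleaned-up grammar, the rules can be transcribed into a Python regular expression. This is only a sketch: the constant names and the `matches` helper are hypothetical, and it assumes that `?any character?` is meant as "any run of characters up to the first `*/`", which the rule as written does not state.

```python
import re

# Assumption: a comment body is any run of characters that does not contain "*/".
COMMENT = r"/\*(?:(?!\*/).)*\*/"
SPACE = rf"(?: |{COMMENT})"            # space  = " " | comment ;
SPACES = rf"{SPACE}+"                  # spaces = { space }- ;
HELLO_WORLD = rf"hello{SPACES}world"   # hello world = "hello" , spaces , "world" ;
ANSWER = r"42"                         # answer = "42" ;
CONTENT = rf"(?:{HELLO_WORLD}|{ANSWER}|{SPACE})"
DOCUMENT = re.compile(rf"{CONTENT}+", re.DOTALL)  # document = { content }- ;


def matches(text: str) -> bool:
    """Return True if the whole text is derivable from `document`."""
    return DOCUMENT.fullmatch(text) is not None


if __name__ == "__main__":
    print(matches("hello /* I'm the taxman! */ world 42"))
    print(matches("hel/* no */lo world 42"))
```

The inputs are lowercase because the grammar spells the terminals as "hello" and "world". A comment between the two words is accepted, while a comment inside a word is rejected, which is the token behaviour described above.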

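Some languages remove comments in a preprocessor, and some replace each comment with a space. Removing comments seems to be the simplest solution to this problem. However, this solution also removes comments from inside literals, which is not normally done.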

document = preprocess, process;

preprocess = {(? any character ? - comment, ? append char to text ?)},
    ? text for input to process ?;

comment = "/*", {? any character ? - "*/"}, "*/", ? discard ?;

process = {content}-;

content = hello world | answer | spaces;

hello world = ("H" | "h"), "ello", spaces, ("W" | "w") , "orld";

answer = "42";

spaces = {" "}-;

The preprocessor, given

Hello /* I'm the taxman! */ World 42

produces

Hello  World 42

Note the two spaces.

And, for

Hel/*p! I need somebody. Help! Not just anybody... */lo World 42

produces

Hello World 42
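The preprocessing pass can be sketched in a few lines of Python. This is a minimal illustration, not a full implementation: the function name `strip_block_comments` is hypothetical, comments are assumed not to nest, and an unterminated comment is treated as an error (the grammar above does not say what should happen in that case).

```python
def strip_block_comments(text: str) -> str:
    """Remove every /* ... */ block comment from `text`.

    Assumptions: comments do not nest, and a comment ends at the first "*/".
    """
    out = []
    i = 0
    while i < len(text):
        if text.startswith("/*", i):
            end = text.find("*/", i + 2)
            if end == -1:
                # Assumption: an unterminated comment is an error.
                raise ValueError("unterminated block comment")
            i = end + 2  # skip the comment, including its closing delimiter
        else:
            out.append(text[i])  # "append char to text"
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(strip_block_comments("Hello /* I'm the taxman! */ World 42"))
    print(strip_block_comments(
        "Hel/*p! I need somebody. Help! Not just anybody... */lo World 42"))
```

As noted above, a pass like this also strips anything that looks like a comment inside a string literal, so a real preprocessor would have to recognise literals first.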