How to describe a conditional statement (if-then-else) using PEG
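I'm working on a parser for Qt's qmake project files (an open-source project).
I'm having trouble describing qmake's variant of the conditional statement, which the documentation calls a "scope".
EBNF (simplified):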
ScopeStatement -> Condition ScopeBody
Condition -> Identifier | TestFunctionCall | NotExpr | OrExpr | AndExpr
NotExpr -> "!" Condition
OrExpr -> Condition "|" Condition
AndExpr -> Condition ":" Condition
ScopeBody -> COLON Statement | BLOCK_OPEN Statement:* BLOCK_CLOSE
Statement -> AssignmentStatement
AssignmentStatement -> Identifier EQ String
// There are many other built-in boolean functions
TestFunctionCall -> ("defined" | ...) ARG_LIST_OPEN (String COMMA:?):* ARG_LIST_CLOSE
Identifier -> Letter (Letter | Digit | UNDERSCP):+
String -> (Letter | Digit | UNDERSCP):+
EQ -> "="
COLON -> ":"
COMMA -> ","
ARG_LIST_OPEN -> "("
ARG_LIST_CLOSE -> ")"
BLOCK_OPEN -> "{"
BLOCK_CLOSE -> "}"
UNDERSCP -> "_"
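First question: how can the colon used as the AND operator be distinguished from the colon that terminates the condition? Is it even possible?
P.S. My grammar draft (which does not support function calls) fails even on a simple case like this:
win32:xml: x = y
PEG.js code: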
Start
= ScopeStatement
// qmake scope statement
ScopeStatement
= BooleanExpression ws* ((":" ws* SingleLineStatement) / ("{" ws* MultiLineStatement ))
SingleLineStatement
= Identifier ws* "=" ws* Identifier lb*
MultiLineStatement
= (SingleLineStatement lb*)+
// qmake condition statement
BooleanExpression
= BooleanOrExpression
BooleanOrExpression
= left:BooleanAndExpression ws* "|" ws* right:BooleanOrExpression { return {type: "OR", left:left, right:right} }
/ BooleanAndExpression
BooleanAndExpression
= left:BooleanNotExpression ws* ":" ws* right:BooleanAndExpression { return {type: "AND", left:left, right:right} }
/ BooleanNotExpression
BooleanNotExpression
= "!" ws* operand:BooleanNotExpression { return {type: "NOT", operand: operand } }
/ BooleanComplexExpression
BooleanComplexExpression
= Identifier
/ "(" logical_or:BooleanOrExpression ")" { return logical_or; }
Identifier
= token:[a-zA-Z0-9_]+ { return token.join(""); }
ws
= [ \t]
lb
= [\r\n]
Thanks!
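You need to add a negative lookahead inside BooleanAndExpression, so that a ":" followed by anything that is not another BooleanAndExpression operand (here, a SingleLineStatement) is not taken as the AND operator. Otherwise the rule keeps greedily consuming additional "and" expressions, and since a PEG does not go back into a rule that has already succeeded, ScopeStatement never sees the terminating ":".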
Start
= ScopeStatement
// qmake scope statement
ScopeStatement
= bool:BooleanExpression ws* state:Statement { return {bool:bool, state:state} }
Statement
= ":" ws* state:SingleLineStatement { return state }
SingleLineStatement
= left:Identifier ws* "=" ws* right:Identifier lb* { return {type: "ASSIGN", left:left, right:right} }
MultiLineStatement
= (SingleLineStatement lb*)+
// qmake condition statement
BooleanExpression
= BooleanOrExpression
BooleanOrExpression
= left:BooleanAndExpression ws* "|" ws* right:BooleanOrExpression { return {type: "OR", left:left, right:right} }
/ BooleanAndExpression
BooleanAndExpression
= left:BooleanNotExpression ws* !(":" ws* SingleLineStatement) ":" ws* right:BooleanAndExpression { return {type: "AND", left:left, right:right} }
/ BooleanNotExpression
BooleanNotExpression
= "!" ws* operand:BooleanNotExpression { return {type: "NOT", operand: operand } }
/ BooleanComplexExpression
BooleanComplexExpression
= Identifier
/ "(" logical_or:BooleanOrExpression ")" { return logical_or; }
Identifier
= token:[a-zA-Z0-9_]+ { return token.join(""); }
ws
= [ \t]
lb
= [\r\n]
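To make the effect of the lookahead easier to follow, here is a minimal hand-written sketch of the same idea in plain JavaScript. It is not generated by PEG.js and is not the answer's grammar verbatim: the function names (parseScope, andExpr, and so on) are made up for illustration, and parenthesized conditions, "{ }" blocks and line breaks are left out. The point is only the check in andExpr, which refuses to treat ":" as AND when a full statement follows it.

```javascript
// Hand-rolled PEG-style sketch (hypothetical names, not PEG.js output).
// Every parse function takes (input, pos) and returns { node, pos } or null.

const skipWs = (s, i) => {
  while (s[i] === " " || s[i] === "\t") i++;
  return i;
};

// Identifier = [a-zA-Z0-9_]+
function identifier(s, i) {
  const m = /^[a-zA-Z0-9_]+/.exec(s.slice(i));
  return m ? { node: m[0], pos: i + m[0].length } : null;
}

// SingleLineStatement = Identifier ws* "=" ws* Identifier
function singleLineStatement(s, i) {
  const left = identifier(s, i);
  if (!left) return null;
  const p = skipWs(s, left.pos);
  if (s[p] !== "=") return null;
  const right = identifier(s, skipWs(s, p + 1));
  if (!right) return null;
  return { node: { type: "ASSIGN", left: left.node, right: right.node }, pos: right.pos };
}

// Statement = ":" ws* SingleLineStatement
function statement(s, i) {
  if (s[i] !== ":") return null;
  return singleLineStatement(s, skipWs(s, i + 1));
}

// NotExpr = "!" ws* NotExpr / Identifier
function notExpr(s, i) {
  if (s[i] === "!") {
    const r = notExpr(s, skipWs(s, i + 1));
    return r ? { node: { type: "NOT", operand: r.node }, pos: r.pos } : null;
  }
  return identifier(s, i);
}

// AndExpr = NotExpr ws* !Statement ":" ws* AndExpr / NotExpr
function andExpr(s, i) {
  const left = notExpr(s, i);
  if (!left) return null;
  const p = skipWs(s, left.pos);
  // Negative lookahead: ":" is the AND operator only if no statement starts here.
  if (s[p] === ":" && statement(s, p) === null) {
    const right = andExpr(s, skipWs(s, p + 1));
    if (right) {
      return { node: { type: "AND", left: left.node, right: right.node }, pos: right.pos };
    }
  }
  return left; // second alternative: a lone NotExpr
}

// OrExpr = AndExpr ws* "|" ws* OrExpr / AndExpr
function orExpr(s, i) {
  const left = andExpr(s, i);
  if (!left) return null;
  const p = skipWs(s, left.pos);
  if (s[p] === "|") {
    const right = orExpr(s, skipWs(s, p + 1));
    if (right) {
      return { node: { type: "OR", left: left.node, right: right.node }, pos: right.pos };
    }
  }
  return left;
}

// ScopeStatement = OrExpr ws* Statement (must consume the whole input)
function parseScope(s) {
  const cond = orExpr(s, 0);
  if (!cond) return null;
  const body = statement(s, skipWs(s, cond.pos));
  if (!body || body.pos !== s.length) return null;
  return { bool: cond.node, state: body.node };
}

console.log(JSON.stringify(parseScope("win32:xml: x = y")));
```

With the lookahead in place, the first ":" is read as AND (what follows is "xml:", not an assignment), while the second ":" is left for Statement because "x = y" follows it. Removing the `statement(s, p) === null` check reproduces the original failure: the condition swallows "x" as a third operand and the parse stops at "=".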