PegJS 数学解析

PegJS Maths Parsing

目前,我的语法支持基本的变量赋值,我将在本示例中使用它,但我在数学解析方面遇到了一些问题。 尝试 returns 时正确:

test = 10^2

例如,但 returns:

Line 1, column 8: Expected "(", "/*", "false", "null", "true", identifier, number, string, or whitespace but "1" found.

尝试时:

test = 10MOD2

奇怪的是,尝试使用时它似乎工作正常:

test = 10 MOD2

这是我的语法:

Start
  = __ program:Program __ { return program; }

// ----- A.1 Lexical Grammar -----

SourceCharacter
  = .

WhiteSpace "whitespace"
  = "\t"
  / "\v"
  / "\f"
  / " "
  / "\u00A0"
  / "\uFEFF"

LineTerminator
  = [\n\r\u2028\u2029]

LineTerminatorSequence "end of line"
  = "\n"
  / "\r\n"
  / "\r"
  / "\u2028"
  / "\u2029"

Comment "comment"
  = MultiLineComment
  / SingleLineComment

MultiLineComment
  = "/*" (!"*/" SourceCharacter)* "*/"

MultiLineCommentNoLineTerminator
  = "/*" (!("*/" / LineTerminator) SourceCharacter)* "*/"

SingleLineComment
  = "//" (!LineTerminator SourceCharacter)*

Identifier
  = !ReservedWord name:IdentifierName { return name; }

IdentifierName "identifier"
  = head:IdentifierStart tail:IdentifierPart* {
      return {
        type: "Identifier",
        name: head + tail.join("")
      };
    }

IdentifierStart
  = UnicodeLetter

IdentifierPart
  = IdentifierStart
  / UnicodeDigit
  / "\u200C"
  / "\u200D"

UnicodeLetter
  = [a-zA-Z]

UnicodeDigit
  = [0-9]

ReservedWord
  = Keyword
  / FutureReservedWord
  / NullLiteral
  / BooleanLiteral

Keyword
  = BreakToken
  / CaseToken
  / CatchToken
  / ContinueToken
  / DebuggerToken
  / DefaultToken
  / DeleteToken
  / DoToken
  / ElseToken
  / FinallyToken
  / ForToken
  / FunctionToken
  / IfToken
  / InstanceofToken
  / InToken
  / NewToken
  / ReturnToken
  / SwitchToken
  / ThisToken
  / ThrowToken
  / TryToken
  / TypeofToken
  / VarToken
  / VoidToken
  / WhileToken
  / WithToken
  / GlobalToken
  / ModulusToken
  / QuotientToken

FutureReservedWord
  = ClassToken
  / ConstToken
  / EnumToken
  / ExportToken
  / ExtendsToken
  / ImportToken
  / SuperToken

Literal
  = NullLiteral
  / BooleanLiteral
  / NumericLiteral
  / StringLiteral

NullLiteral
  = NullToken { return { type: "Literal", value: null }; }

BooleanLiteral
  = TrueToken  { return { type: "Literal", value: true  }; }
  / FalseToken { return { type: "Literal", value: false }; }

// The "!(IdentifierStart / DecimalDigit)" predicate is not part of the official
// grammar, it comes from text in section 7.8.3.
NumericLiteral "number"
  = literal:DecimalLiteral !(IdentifierStart / DecimalDigit) {
      return literal;
    }

DecimalLiteral
  = DecimalIntegerLiteral "." DecimalDigit* {
      return { type: "Literal", value: parseFloat(text()) };
    }
  / "." DecimalDigit+ {
      return { type: "Literal", value: parseFloat(text()) };
    }
  / DecimalIntegerLiteral {
      return { type: "Literal", value: parseFloat(text()) };
    }

DecimalIntegerLiteral
  = "0"
  / NonZeroDigit DecimalDigit*

DecimalDigit
  = [0-9]

NonZeroDigit
  = [1-9]

ExponentPart
  = ExponentIndicator SignedInteger

ExponentIndicator
  = "e"i

SignedInteger
  = [+-]? DecimalDigit+

StringLiteral "string"
  = '"' chars:DoubleStringCharacter* '"' {
      return { type: "Literal", value: chars.join("") };
    }
  / "'" chars:SingleStringCharacter* "'" {
      return { type: "Literal", value: chars.join("") };
    }

DoubleStringCharacter
  = !('"' / "\" / LineTerminator) SourceCharacter { return text(); }
  / "\" sequence:EscapeSequence { return sequence; }
  / LineContinuation

SingleStringCharacter
  = !("'" / "\" / LineTerminator) SourceCharacter { return text(); }
  / "\" sequence:EscapeSequence { return sequence; }
  / LineContinuation

LineContinuation
  = "\" LineTerminatorSequence { return ""; }

EscapeSequence
  = CharacterEscapeSequence
  / "0" !DecimalDigit { return "[=14=]"; }

CharacterEscapeSequence
  = SingleEscapeCharacter
  / NonEscapeCharacter

SingleEscapeCharacter
  = "'"
  / '"'
  / "\"
  / "b"  { return "\b"; }
  / "f"  { return "\f"; }
  / "n"  { return "\n"; }
  / "r"  { return "\r"; }
  / "t"  { return "\t"; }
  / "v"  { return "\v"; }

NonEscapeCharacter
  = !(EscapeCharacter / LineTerminator) SourceCharacter { return text(); }

EscapeCharacter
  = SingleEscapeCharacter
  / DecimalDigit
  / "x"
  / "u"

BreakToken      = "break"      !IdentifierPart
CaseToken       = "case"       !IdentifierPart
CatchToken      = "catch"      !IdentifierPart
ClassToken      = "class"      !IdentifierPart
ConstToken      = "const"      !IdentifierPart
ContinueToken   = "continue"   !IdentifierPart
DebuggerToken   = "debugger"   !IdentifierPart
DefaultToken    = "default"    !IdentifierPart
DeleteToken     = "delete"     !IdentifierPart
DoToken         = "do"         !IdentifierPart
ElseToken       = "else"       !IdentifierPart
EnumToken       = "enum"       !IdentifierPart
ExportToken     = "export"     !IdentifierPart
ExtendsToken    = "extends"    !IdentifierPart
FalseToken      = "false"      !IdentifierPart
FinallyToken    = "finally"    !IdentifierPart
ForToken        = "for"        !IdentifierPart
FunctionToken   = "function"   !IdentifierPart
GetToken        = "get"        !IdentifierPart
IfToken         = "if"         !IdentifierPart
ImportToken     = "import"     !IdentifierPart
InstanceofToken = "instanceof" !IdentifierPart
InToken         = "in"         !IdentifierPart
NewToken        = "new"        !IdentifierPart
NullToken       = "null"       !IdentifierPart
ReturnToken     = "return"     !IdentifierPart
SetToken        = "set"        !IdentifierPart
SuperToken      = "super"      !IdentifierPart
SwitchToken     = "switch"     !IdentifierPart
ThisToken       = "this"       !IdentifierPart
ThrowToken      = "throw"      !IdentifierPart
TrueToken       = "true"       !IdentifierPart
TryToken        = "try"        !IdentifierPart
TypeofToken     = "typeof"     !IdentifierPart
VarToken        = "var"        !IdentifierPart
VoidToken       = "void"       !IdentifierPart
WhileToken      = "while"      !IdentifierPart
WithToken       = "with"       !IdentifierPart
GlobalToken     = "global"     !IdentifierPart
ModulusToken    = "MOD"        !IdentifierPart
QuotientToken   = "DIV"        !IdentifierPart

// Skipped

___
  = (WhiteSpace / /*LineTerminatorSequence / Comment*/ MultiLineCommentNoLineTerminator)+
__
  = (WhiteSpace / LineTerminatorSequence / Comment)*

_
  = (WhiteSpace / MultiLineCommentNoLineTerminator)*

Program
  = __ body:StatementList __ {
  return {
    type: "Program",
    body: body
  }
  }

StatementList
  = (Statement)*

Statement
  = __ body:(VariableAssignment / GlobalAssignment) __
  {
  return body
  }

MathematicalExpression = additive

additive = left:multiplicative _ atag:("+" / "-") _ right:additive { return {type: "MathematicalExpression", operator: atag, left:left, right:right}; } / multiplicative 

multiplicative = left:exponential _ atag:("*" / "/" / "MOD" / "DIV") _ right:multiplicative { return {type: "MathematicalExpression", operator: atag, left:left, right:right}; } / exponential

exponential = left:primary _ atag:("^") _ right:exponential { return {type: "MathematicalExpression", operator: atag, left:left, right:right}; } / primary

primary = (DirectValueIntegerNoEq) / "(" additive:additive ")" { return additive; }


DirectValue
 = MathematicalExpression
 / Literal
 / Identifier

DirectValueInteger
 = MathematicalExpression
 / NumericLiteral
 / Identifier

DirectValueIntegerNoEq
 = NumericLiteral
 / Identifier

VariableAssignment
  = left:Identifier _ "=" _ right:DirectValue
  {
  return {
  type: "VariableAssignment",
  left: left,
  right: right
  }
  }

GlobalAssignment
  = GlobalToken ___ left:Identifier _ "=" _ right:DirectValue
  {
  return {
  type: "GlobalAssignment",
  left: left,
  right: right
  }
  }

您需要删除 NumericLiteral 中的 !IdentifierStart。因为 MOD 可能是 IdentifierStart。您的第二个示例 (test = 10 MOD2) 有效,因为 MOD2 被视为标识符而不是 MOD 操作。

即,更改为:

// The "!(IdentifierStart / DecimalDigit)" predicate is not part of the official
// grammar, it comes from text in section 7.8.3.
NumericLiteral "number"
  = literal:DecimalLiteral !(IdentifierStart / DecimalDigit) {
      return literal;
    }

对此:

// The "!(IdentifierStart / DecimalDigit)" predicate is not part of the official
// grammar, it comes from text in section 7.8.3.
NumericLiteral "number"
  = literal:DecimalLiteral !DecimalDigit {
      return literal;
    }

输出:

{
   "type": "Program",
   "body": [
      {
         "type": "VariableAssignment",
         "left": {
            "type": "Identifier",
            "name": "test"
         },
         "right": {
            "type": "MathematicalExpression",
            "operator": "MOD",
            "left": {
               "type": "Literal",
               "value": 10
            },
            "right": {
               "type": "Literal",
               "value": 2
            }
         }
      }
   ]
}