Operator precedence in megaparsec

I'm having trouble using Megaparsec 6's makeExprParser helper. I can't figure out how to get both binary ^ and unary - to bind at the precedence levels I expect.

Using this makeExprParser expression parser:

expressionParser :: Parser Expression
expressionParser =
    makeExprParser termParser
      [
        [InfixR $ BinOp BinaryExp <$ symbol "^"],
        [
          Prefix $ MonOp MonoMinus <$ symbol "-",
          Prefix $ MonOp MonoPlus <$ symbol "+"
        ],
        [
          InfixL $ BinOp BinaryMult <$ symbol "*",
          InfixL $ BinOp BinaryDiv <$ symbol "/"
        ],
        [
          InfixL $ BinOp BinaryPlus <$ symbol "+",
          InfixL $ BinOp BinaryMinus <$ symbol "-"
        ]
      ]

I would expect these tests to pass:

testEqual expressionParser "1^2" "(1)^(2)"
testEqual expressionParser "-1^2" "-(1^2)"
testEqual expressionParser "1^-2" "1^(-2)"
testEqual expressionParser "-1^-2" "-(1^(-2))"

That is, -1^-2 should parse to the same thing as -(1^(-2)). This is how Python, for example, parses it:

>>> 2**-2
0.25
>>> -2**-2
-0.25
>>> -2**2
-4

And Ruby:

irb(main):004:0> 2**-2
=> (1/4)
irb(main):005:0> -2**-2
=> (-1/4)
irb(main):006:0> -2**2
=> -4

But this Megaparsec parser fails to parse 1^-2 at all, and instead gives me this helpful error:

(TrivialError (SourcePos {sourceName = \"test.txt\", sourceLine = Pos 1, sourceColumn = Pos 3} :| []) (Just (Tokens ('-' :| \"\"))) (fromList [Tokens ('(' :| \"\"),Label ('i' :| \"nteger\")]))")

Which I read as saying: "I could have taken any of these characters here, but that - has me flummoxed".

If I adjust the precedences in the operator table like this (moving exponentiation after unary -):

expressionParser =
    makeExprParser termParser
      [
        [
          Prefix $ MonOp MonoMinus <$ symbol "-",
          Prefix $ MonOp MonoPlus <$ symbol "+"
        ],
        [InfixR $ BinOp BinaryExp <$ symbol "^"],
        [
          InfixL $ BinOp BinaryMult <$ symbol "*",
          InfixL $ BinOp BinaryDiv <$ symbol "/"
        ],
        [
          InfixL $ BinOp BinaryPlus <$ symbol "+",
          InfixL $ BinOp BinaryMinus <$ symbol "-"
        ]
      ]

Then I no longer get a parse failure, but -1^2 incorrectly parses as (-1)^2 (instead of the correct -(1^2)).

Here is a complete, self-contained parser that shows the problem (it requires HUnit and, of course, megaparsec):

module Hascas.Minimal where

import Data.Void (Void)
import Test.HUnit hiding (test)
import Text.Megaparsec hiding (ParseError)
import Text.Megaparsec.Char
import Text.Megaparsec.Expr
import qualified Text.Megaparsec as MP
import qualified Text.Megaparsec.Char.Lexer as L

data Expression
    = Literal Integer
    | MonOp MonoOperator Expression
    | BinOp BinaryOperator Expression Expression
  deriving (Read, Show, Eq, Ord)

data BinaryOperator
    = BinaryPlus
    | BinaryMinus
    | BinaryDiv
    | BinaryMult
    | BinaryExp
  deriving (Read, Show, Eq, Ord)

data MonoOperator
    = MonoPlus
    | MonoMinus
  deriving (Read, Show, Eq, Ord)

type Parser a = Parsec Void String a
type ParseError = MP.ParseError (Token String) Void

spaceConsumer :: Parser ()
spaceConsumer = L.space space1 lineComment blockComment
  where
    lineComment  = L.skipLineComment "//"
    blockComment = L.skipBlockComment "/*" "*/"

lexeme :: Parser a -> Parser a
lexeme = L.lexeme spaceConsumer

symbol :: String -> Parser String
symbol = L.symbol spaceConsumer

expressionParser :: Parser Expression
expressionParser =
    makeExprParser termParser
      [
        [InfixR $ BinOp BinaryExp <$ symbol "^"],
        [
          Prefix $ MonOp MonoMinus <$ symbol "-",
          Prefix $ MonOp MonoPlus <$ symbol "+"
        ],
        [
          InfixL $ BinOp BinaryMult <$ symbol "*",
          InfixL $ BinOp BinaryDiv <$ symbol "/"
        ],
        [
          InfixL $ BinOp BinaryPlus <$ symbol "+",
          InfixL $ BinOp BinaryMinus <$ symbol "-"
        ]
      ]

termParser :: Parser Expression
termParser = (
        (try $ Literal <$> L.decimal)
    <|> (try $ parens expressionParser))

parens :: Parser a -> Parser a
parens x = between (symbol "(") (symbol ")") x

main :: IO ()
main = do
    -- just to show that it does work in the + case:
    test expressionParser "1+(-2)" $
      BinOp BinaryPlus (Literal 1) (MonOp MonoMinus $ Literal 2)
    test expressionParser "1+-2" $
      BinOp BinaryPlus (Literal 1 ) (MonOp MonoMinus $ Literal 2)

    -- but not in the ^ case
    test expressionParser "1^-2" $
      BinOp BinaryExp (Literal 1) (MonOp MonoMinus $ Literal 2)
    test expressionParser "-1^2" $
      MonOp MonoMinus $ BinOp BinaryExp (Literal 1) (Literal 2)
    test expressionParser "-1^-2" $
      MonOp MonoMinus $ BinOp BinaryExp (Literal 1) (MonOp MonoMinus $ Literal 2)

    -- exponent precedence is weird
    testEqual expressionParser "1^2" "(1)^(2)"
    testEqual expressionParser "-1^2" "-(1^2)"
    testEqual expressionParser "1^-2" "1^(-2)"
    testEqual expressionParser "-1^-2" "-(1^(-2))"
    testEqual expressionParser "1^2^3^4" "1^(2^(3^(4)))"
  where
    test :: (Eq a, Show a) => Parser a -> String -> a -> IO ()
    test parser input expected = do
      assertEqual input (Right expected) $ parse (spaceConsumer >> parser <* eof) "test.txt" input

    testEqual :: (Eq a, Show a) => Parser a -> String -> String -> IO ()
    testEqual parser input expected = do
        assertEqual input (p expected) (p input)
      where
        p i = parse (spaceConsumer >> parser <* eof) "test.txt" i

Is it possible to get Megaparsec to parse these operators at the precedence levels other languages use?

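makeExprParser termParser [precN, ..., prec1] produces an expression parser in which each precedence level calls the next-higher precedence level for its operands. So if you defined it by hand, you would have a rule for infix + and - that uses the mult-and-div rule for its operands. That rule in turn would use the prefix rule for its operands, and the prefix rule would use the ^ rule for its operand. Finally, the ^ rule uses termParser for its operands.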

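The important thing to note here is that the ^ rule (or, more generally, any rule with a higher precedence than the prefix operators) calls a parser that does not accept a prefix operator at the start. So prefix operators cannot appear on the right-hand side of such operators (except inside parentheses).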

That basically means makeExprParser does not support your use case.
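To make the layering concrete, here is a rough hand-written sketch of the rule chain that a table with ^ above the prefix operators corresponds to. It is not what makeExprParser literally generates. The Expr type and the names additive, prefixLevel, power and term are made up for this illustration, and the * and / level is left out to keep it short.

```haskell
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char (space)
import qualified Text.Megaparsec.Char.Lexer as L

-- Hypothetical, cut-down stand-in for the question's Expression type.
data Expr
  = Lit Integer
  | Neg Expr
  | Pow Expr Expr
  | Add Expr Expr
  deriving (Show, Eq)

type Parser = Parsec Void String

lexeme :: Parser a -> Parser a
lexeme = L.lexeme space

symbol :: String -> Parser String
symbol = L.symbol space

-- Loosest level: left-associative infix +, operands come from the prefix level.
additive :: Parser Expr
additive = do
  first <- prefixLevel
  rest  <- many (symbol "+" *> prefixLevel)
  pure (foldl Add first rest)

-- Prefix level: at most one unary -, operand comes from the ^ level.
prefixLevel :: Parser Expr
prefixLevel = (Neg <$ symbol "-" <*> power) <|> power

-- Tightest level: right-associative ^. Its operands are plain terms,
-- so nothing on this level ever gets a chance to accept a leading -.
power :: Parser Expr
power = do
  lhs <- term
  rhs <- optional (symbol "^" *> power)
  pure (maybe lhs (Pow lhs) rhs)

term :: Parser Expr
term = Lit <$> lexeme L.decimal
   <|> between (symbol "(") (symbol ")") additive

parseExpr :: String -> Maybe Expr
parseExpr = parseMaybe (space *> additive)

main :: IO ()
main = do
  print (parseExpr "-1^2")   -- Just (Neg (Pow (Lit 1) (Lit 2)))
  print (parseExpr "1+-2")   -- Just (Add (Lit 1) (Neg (Lit 2)))
  print (parseExpr "1^-2")   -- Nothing: after ^ only a term is accepted
  print (parseExpr "1^(-2)") -- Just (Pow (Lit 1) (Neg (Lit 2)))
```

The + case works because the operands of + come from the prefix level, while the operands of ^ come from term, which has no way to start with -.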

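To work around this, you can use makeExprParser only for the infix operators with lower precedence than the prefix operators, and handle the prefix operators and ^ by hand, so that the right operand of ^ will "loop back" to the prefix operators. Something like this: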

expressionParser =
    makeExprParser prefixParser
      [
        [
          InfixL $ BinOp BinaryMult <$ symbol "*",
          InfixL $ BinOp BinaryDiv <$ symbol "/"
        ],
        [
          InfixL $ BinOp BinaryPlus <$ symbol "+",
          InfixL $ BinOp BinaryMinus <$ symbol "-"
        ]
      ]

prefixParser =
  do
    prefixOps <- many prefixOp
    exp <- exponentiationParser
    return $ foldr ($) exp prefixOps
  where
    prefixOp = MonOp MonoMinus <$ symbol "-" <|> MonOp MonoPlus <$ symbol "+"

exponentiationParser =
  do
    lhs <- termParser
    -- Loop back up to prefix instead of going down to term
    rhs <- optional (symbol "^" >> prefixParser)
    return $ maybe lhs (BinOp BinaryExp lhs) rhs

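Note that, unlike makeExprParser, this also allows multiple consecutive prefix operators (such as --x for double negation). If you don't want that, replace many with optional in the definition of prefixParser.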
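For reference, here is a self-contained sketch of the workaround with the optional variant of prefixParser, so that you can check it against the expectations from the question. It assumes Megaparsec 6 (hence Text.Megaparsec.Expr). On Megaparsec 7 and later, as far as I know, makeExprParser lives in Control.Monad.Combinators.Expr from the parser-combinators package instead. The whitespace handling is simplified to space, without the comment parsers, and parseExpression is just a helper name for this example.

```haskell
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char (space)
import Text.Megaparsec.Expr -- assumption: Megaparsec 6
import qualified Text.Megaparsec.Char.Lexer as L

data Expression
  = Literal Integer
  | MonOp MonoOperator Expression
  | BinOp BinaryOperator Expression Expression
  deriving (Show, Eq)

data BinaryOperator
  = BinaryPlus | BinaryMinus | BinaryDiv | BinaryMult | BinaryExp
  deriving (Show, Eq)

data MonoOperator = MonoPlus | MonoMinus
  deriving (Show, Eq)

type Parser = Parsec Void String

symbol :: String -> Parser String
symbol = L.symbol space

-- makeExprParser only handles the levels below the prefix operators.
expressionParser :: Parser Expression
expressionParser =
  makeExprParser prefixParser
    [ [ InfixL (BinOp BinaryMult <$ symbol "*")
      , InfixL (BinOp BinaryDiv <$ symbol "/") ]
    , [ InfixL (BinOp BinaryPlus <$ symbol "+")
      , InfixL (BinOp BinaryMinus <$ symbol "-") ]
    ]

-- At most one prefix operator (optional instead of many).
prefixParser :: Parser Expression
prefixParser = do
  prefixOp <- optional (MonOp MonoMinus <$ symbol "-" <|> MonOp MonoPlus <$ symbol "+")
  operand  <- exponentiationParser
  pure (maybe operand ($ operand) prefixOp)

-- The right operand of ^ loops back up to prefixParser.
exponentiationParser :: Parser Expression
exponentiationParser = do
  lhs <- termParser
  rhs <- optional (symbol "^" >> prefixParser)
  pure (maybe lhs (BinOp BinaryExp lhs) rhs)

termParser :: Parser Expression
termParser = Literal <$> L.lexeme space L.decimal
         <|> between (symbol "(") (symbol ")") expressionParser

-- Hypothetical helper: parse a whole string or return Nothing.
parseExpression :: String -> Maybe Expression
parseExpression = parseMaybe (space *> expressionParser)

main :: IO ()
main = do
  print (parseExpression "-1^2" == parseExpression "-(1^2)")       -- True
  print (parseExpression "1^-2" == parseExpression "1^(-2)")       -- True
  print (parseExpression "-1^-2" == parseExpression "-(1^(-2))")   -- True
  print (parseExpression "1^2^3^4" == parseExpression "1^(2^(3^4))") -- True
  print (parseExpression "--1")                                    -- Nothing
```

With many in place of optional, the last input would parse as a double negation instead of being rejected.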