在 Haskell 中使用 Parsec 解析字符串时如何保留阶乘计算？

Question

我运行在 Haskell 中尝试使用 Text.Parsec 解决阶乘输出计算时遇到困难。让我们看看到目前为止我使用的一些代码：

import Text.Parsec hiding(digit)
import Data.Functor

type CalcParser a = CalcParsec String () a

digit :: CalcParser Char
digit = oneOf ['0'..'9']

number :: CalcParser Double
number = try calcDouble <|> calcInt  

calcInt :: CalcParser Double 
calcInt = read <$> many1 digit

calcDouble :: CalcParser Double 
calcDouble = do
    integ <- many1 digit
    char '.'
    decim <- many1 digit
    pure $ read $ integ <> "." <> decim

numberFact :: CalcParser Integer     
numberFact = read <$> many1 digit

applyMany :: a -> [a -> a] -> a
applyMany x [] = x
applyMany x (h:t) = applyMany (h x) t

div_ :: CalcParser (Double -> Double -> Double)
div_= do
    char '/'
    return (/)

star :: CalcParser (Double -> Double -> Double)
star = do
    char '*'    
    return (*)  

plus :: CalcParser (Double -> Double -> Double)
plus = do
    char '+'    
    return (+)

minus :: CalcParser (Double -> Double -> Double)
minus = do
    char '-'    
    return (-)  

multiplication :: CalcParser Double
multiplication = do
    spaces
    lhv <- atom <|> fact' <|> negation'
    spaces
    t <- many tail
    return $ applyMany lhv t
    where tail =
                do
                    f <- star <|> div_
                    spaces
                    rhv <- atom <|> fact' <|> negation'
                    spaces
                    return (`f` rhv)

atom :: CalcParser Double
atom = number <|> do
    char '('
    res <- addition
    char ')'
    return res

fact' :: CalcParser Double
fact' = do
    spaces
    rhv <- numberFact
    char '!'
    return $ factorial rhv

factorial :: Integer -> Double
factorial n
    | n < 0 = error "No factorial exists for negative inputs" 
    | n == 0 || n == 1 = 1
    | otherwise = acc n 1
    where
    acc 0 a = a
    acc b a = acc (b-1) (fromIntegral b * a) 

negation' :: CalcParser Double
negation' = do
    spaces
    char '~'
    rhv <- atom
    spaces
    return $ negate rhv         

addition :: CalcParser Double
addition = do
    spaces
    lhv <- multiplication <|> fact' <|> negation'
    spaces
    t <- many tail
    return $ applyMany lhv t
    where tail =
                do
                    f <- plus <|> minus 
                    spaces
                    rhv <- multiplication <|> fact' <|> negation'
                    spaces
                    return (`f` rhv)

这是一种简单的计算器解析器，提供加法/减法/阶乘处理。我将让输入字符串块的任何阶乘链接组件，以便它们中的每一个都包含一些数字，后跟“！”字符（通常在大多数现实世界的计算器中）。但是，当运行将解析器测试为：

parseTest addition "3!"

它无法计算阶乘本身，但是 returns 3.0（输出显示为浮点数，这就是我在此程序中使用 CalcParser Double 的原因）。一个奇怪的事实是，每当我设置 '!'在数字之前：

fact' :: CalcParser Double
fact' = do
    spaces
    char '!'
    rhv <- numberFact
    return $ factorial rhv

结果：

parseTest addition "!3"

符合我的预期，即它等于 6.0。在那之后，我假设第一个版本的东西有一些挫折，每当'！'时都无法运行阶乘计算。是一些类似数字的输入元素的旁边。这只是我在这里寻找任何帮助解决的案例。那么，右侧的 '!' 有什么问题？您将如何处理所描述的问题？

Answer 1

普遍的问题是 Parsec 在找到第一个可行的替代方案时停止尝试替代方案。如果你写：

parseTest (atom <|> fact') "3!"

这将导致 3.0。这是因为 atom 解析器成功解析了字符串 "3" 的初始部分，所以 Parsec 甚至不会尝试 fact' 解析器。它将字符串的其余部分 "!" 留给另一个稍后的解析器来处理。

在上面的解析器代码中，如果您尝试解析：

parseTest addition "3!"

addition 解析器首先尝试 multiplication 解析器。 multiplication 解析器有一行：

lhv <- atom <|> fact' <|> negation'

所以它首先尝试 atom 解析器。这个解析器工作正常 returns 3.0，所以它永远不会尝试 fact' 或 negation.

要修复您的解析器，您需要确保在尝试 fact' 之前，您没有成功地解析替代方案中的 atom。您可以先切换顺序：

> parseTest (fact' <|> atom) "3!"
6.0

这可以很好地解析 "3!"（并给出 6.0），但如果您尝试解析其他内容，它会失败：

> parseTest (fact' <|> atom) "3"
parse error at (line 1, column 2):
unexpected end of input
expecting "!"

这是因为，出于效率原因，Parsec 不会自动 "backtrack"。如果您尝试解析某些东西并且它在失败之前 "consumes input" ，它会完全失败而不是尝试更多的选择。在这里，fact' 首先调用 numberFact，它成功地 "consumes" "3"，然后它尝试 char '!'，但失败了。因此，fact "fails after consuming input" 会立即导致解析错误。

您可以使用 try 函数覆盖此行为：

> parseTest (try fact' <|> atom) "3"
3.0
> parseTest (try fact' <|> atom) "3!"
6.0

此处 try fact' 应用解析器 fact' 但将 "fail after consuming input" 视为 "fail after consuming no input"，因此可以检查其他替代方案。

如果您将此更改应用于 multiplication 解析器中的两个位置：

multiplication :: CalcParser Double
multiplication = do
    spaces
    lhv <- try fact' <|> atom <|> negation'
    spaces
    t <- many tail
    return $ applyMany lhv t
    where tail =
                do
                    f <- star <|> div_
                    spaces
                    rhv <- try fact' <|> atom <|> negation'
                    spaces
                    return (`f` rhv)

并对您的 addition 解析器进行了类似的更改：

addition :: CalcParser Double
addition = do
    spaces
    lhv <- try fact' <|> multiplication <|> negation'
    spaces
    t <- many tail
    return $ applyMany lhv t
    where tail =
                do
                    f <- plus <|> minus
                    spaces
                    rhv <- try fact' <|> multiplication <|> negation'
                    spaces
                    return (`f` rhv)

那么效果会更好:

> parseTest addition "3!"
6.0
> parseTest addition "3"
3.0
> parseTest addition "3+2*6!"
1443.0

将 eof 解析器添加到测试中也是一个好主意，以确保您没有任何未解析的尾随垃圾：

> parseTest addition "3  Hey, I could write anything here"
3.0
> parseTest (addition <* eof) "3  but adding eof will generate an error"
parse error at (line 1, column 4):
unexpected 'b'
expecting space, "*", "/", white space, "+", "-" or end of input

在 Haskell 中使用 Parsec 解析字符串时如何保留阶乘计算？

How to preserve factorial computing when parsing string with Parsec in Haskell?

parsing

haskell

parsec

parser-combinators