Is there a more maintainable way to process my datatype?
I have the output of a recursive descent parser, defined using the following datatype:
data CST
  = Program CST CST
  | Block CST CST CST
  | StatementList CST CST
  | EmptyStatementList
  | Statement CST
  | PrintStatement CST CST CST CST
  | AssignmentStatement CST CST CST
  | VarDecl CST CST
  | WhileStatement CST CST CST
  | IfStatement CST CST CST
  | Expr CST
  | IntExpr1 CST CST CST
  | IntExpr2 CST
  | StringExpr CST CST CST
  | BooleanExpr1 CST CST CST CST CST
  | BooleanExpr2 CST
  | Id CST
  | CharList CST CST
  | EmptyCharList
  | Type CST
  | Character CST
  | Space CST
  | Digit CST
  | BoolOp CST
  | BoolVal CST
  | IntOp CST
  | TermComponent Token
  | ErrorTermComponent (Token, Int)
  | NoInput
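As the datatype name suggests, it represents a concrete syntax tree. I would like to know whether there is a more maintainable way to pattern match on this type. For example, to trace the execution of the parse calls, I have the following: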
checkAndPrintParse :: CST -> IO ()
checkAndPrintParse (Program c1 c2) = do
  putStrLn "Parser: parseProgram"
  checkAndPrintParse c1
  checkAndPrintParse c2
checkAndPrintParse (Block c1 c2 c3) = do
  putStrLn "Parser: parseBlock"
  checkAndPrintParse c1
  checkAndPrintParse c2
  checkAndPrintParse c3
checkAndPrintParse (StatementList c1 c2) = do
  putStrLn "Parser: parseStatementList"
  checkAndPrintParse c1
  checkAndPrintParse c2
And so on. I looked at the fix function/pattern, but I am not sure whether it applies here.
Use generic-deriving to get the names of constructors:
- Derive Generic (from GHC.Generics)
- Call conNameOf :: CSTF a -> String (from Generics.Deriving)
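To show what happens behind conNameOf, here is a minimal sketch that recovers a constructor name with nothing but GHC.Generics from base. The type Mini, the class GConName and the function constructorName are hypothetical names made up for this illustration; they play the role of generic-deriving's conNameOf and are not its actual implementation.

```haskell
{-# LANGUAGE DeriveGeneric     #-}
{-# LANGUAGE FlexibleContexts  #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE TypeOperators     #-}

import GHC.Generics

-- Hypothetical three-constructor stand-in for CST.
data Mini
  = Pair Mini Mini
  | Wrap Mini
  | Stop
  deriving Generic

-- Walk the generic representation until the constructor node is reached.
class GConName f where
  gconName :: f p -> String

-- Datatype metadata: skip it and look inside.
instance GConName f => GConName (D1 c f) where
  gconName (M1 x) = gconName x

-- Sum of constructors: follow whichever branch the value lives in.
instance (GConName f, GConName g) => GConName (f :+: g) where
  gconName (L1 x) = gconName x
  gconName (R1 x) = gconName x

-- Constructor metadata: conName (from GHC.Generics) reads the name.
instance Constructor c => GConName (C1 c f) where
  gconName m = conName m

-- Hand-rolled analogue of conNameOf (an assumption of this sketch).
constructorName :: (Generic a, GConName (Rep a)) => a -> String
constructorName = gconName . from
```

In GHCi, constructorName (Wrap Stop) evaluates to "Wrap". The same function works on any type with a Generic instance, which is why no per-constructor code is needed.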
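Use recursion-schemes to traverse a recursive type:
- Derive the base functor of the recursive type with makeBaseFunctor. The base functor of CST, called CSTF, is a parameterized type with the same shape as CST, but with the recursive occurrences of CST replaced by the type parameter.
- Learn to use cata (it may be a bit mind-bending at the beginning). In this case we want to recursively construct an IO () action from a CST, i.e., a function CST -> IO (). For that, the type of cata becomes (CSTF (IO ()) -> IO ()) -> CST -> IO () (with t ~ CST and a ~ IO ()), where the first argument defines the body of the resulting recursive function, and the results of the recursive calls are placed in the fields of the base functor.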
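To make the base functor and cata less abstract, here is a minimal hand-written sketch that uses only base. Mini and MiniF are hypothetical stand-ins for CST and CSTF, and project and cata are defined locally with the same names recursion-schemes uses; this is roughly what makeBaseFunctor and the library give you, not their actual source.

```haskell
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE DeriveFunctor  #-}

-- Hypothetical three-constructor stand-in for CST.
data Mini
  = Pair Mini Mini
  | Wrap Mini
  | Stop

-- Same shape as Mini, but every recursive field becomes the parameter r.
data MiniF r
  = PairF r r
  | WrapF r
  | StopF
  deriving (Functor, Foldable)

-- Peel off one layer of the tree.
project :: Mini -> MiniF Mini
project (Pair a b) = PairF a b
project (Wrap a)   = WrapF a
project Stop       = StopF

-- Fold bottom-up: recurse into the fields first, then apply the algebra f
-- to a layer whose fields already hold the results of the recursive calls.
cata :: (MiniF a -> a) -> Mini -> a
cata f = f . fmap (cata f) . project

-- Two small algebras; neither mentions a specific constructor.
size :: Mini -> Int
size = cata (\layer -> 1 + sum layer)

depth :: Mini -> Int
depth = cata (\layer -> 1 + foldr max 0 layer)
```

For the tree Pair (Wrap Stop) Stop, size gives 4 and depth gives 3. Both algebras rely only on the Foldable instance of MiniF, which is the same trick the tracing function uses with sequence_.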
So, if your goal is to write the recursive function checkAndPrintParse, one case of which looks like this:
checkAndPrintParse (Program c1 c2) = do
  putStrLn "Parser: parseProgram"
  checkAndPrintParse c1
  checkAndPrintParse c2
cata will put the results of its recursive calls on c1 and c2 in place of those fields:
-- goal: find f such that cata f = checkAndPrintParse
-- By definition of cata
cata f (Program c1 c2) = f (ProgramF (cata f c1) (cata f c2))
-- By the goal and the definition of checkAndPrintParse
cata f (Program c1 c2) = checkAndPrintParse (Program c1 c2) = do
  putStrLn "Parser: parseProgram"
  checkAndPrintParse c1
  checkAndPrintParse c2
Therefore:
f (ProgramF (cata f c1) (cata f c2)) = do
  putStrLn "Parser: parseProgram"
  cata f c1
  cata f c2
Abstract over cata f c1 and cata f c2:
f (ProgramF x1 x2) = do
  putStrLn "Parser: parseProgram"
  x1 >> x2
Recognize a fold (in the Foldable sense):
f t@(ProgramF _ _) = do
  putStrLn "Parser: parseProgram"
  sequence_ t
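The step from x1 >> x2 to sequence_ t can be checked in isolation. The sketch below is only an illustration under stated assumptions: MiniF is a hypothetical miniature base functor, and the IO actions are swapped for a pure logging monad, the pair ([String], ()) from base, so the two sides can be compared with ==.

```haskell
{-# LANGUAGE DeriveFoldable #-}

-- Hypothetical miniature of a base functor such as CSTF.
data MiniF r
  = PairF r r
  | WrapF r
  | StopF
  deriving Foldable

-- A pure stand-in for an IO action: the lines it logs, plus a unit result.
-- The Monad instance of ((,) [String]) appends the logs in order.
type Log = ([String], ())

say :: String -> Log
say line = ([line], ())

-- Sequencing the two fields by hand, as in the ProgramF case.
viaBind :: Log
viaBind = say "left" >> say "right"

-- Sequencing every field through the derived Foldable instance.
viaFold :: Log
viaFold = sequence_ (PairF (say "left") (say "right"))

-- A constructor with no fields runs nothing at all.
noFields :: Log
noFields = sequence_ StopF
```

Both viaBind and viaFold evaluate to (["left","right"], ()), and noFields is ([], ()). Because sequence_ visits the fields left to right for any constructor, one clause can replace all the per-constructor clauses.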
Generalize once more:
f t = do
  putStrLn $ "Parser: " ++ conNameOf t -- Prints "ProgramF" instead of "parseProgram"... *shrugs*
  sequence_ t
That is the argument we give to cata.
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE DeriveTraversable #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TemplateHaskell #-}

import GHC.Generics
import Generics.Deriving (conNameOf)
import Data.Functor.Foldable
import Data.Functor.Foldable.TH (makeBaseFunctor)

data CST
  = Program CST CST
  | Block CST CST CST
  | StatementList CST CST
  | EmptyStatementList
  | Statement CST
  | PrintStatement CST CST CST CST
  | AssignmentStatement CST CST CST
  | VarDecl CST CST
  | WhileStatement CST CST CST
  | IfStatement CST CST CST
  | Expr CST
  | IntExpr1 CST CST CST
  | IntExpr2 CST
  | StringExpr CST CST CST
  | BooleanExpr1 CST CST CST CST CST
  | BooleanExpr2 CST
  | Id CST
  | CharList CST CST
  | EmptyCharList
  | Type CST
  | Character CST
  | Space CST
  | Digit CST
  | BoolOp CST
  | BoolVal CST
  | IntOp CST
  | TermComponent Token
  | ErrorTermComponent (Token, Int)
  | NoInput
  deriving Generic

-- Placeholder so the example compiles; substitute your real Token type.
data Token = Token

-- Generates the base functor CSTF and the instances cata needs.
makeBaseFunctor ''CST

-- conNameOf needs a Generic instance for the base functor.
deriving instance Generic (CSTF a)

checkAndPrintParse :: CST -> IO ()
checkAndPrintParse = cata $ \t -> do
  putStrLn $ "Parser: " ++ conNameOf t
  sequence_ t

main :: IO ()
main = checkAndPrintParse $
  Program (Block NoInput NoInput NoInput) (Id NoInput)
Output:
Parser: ProgramF
Parser: BlockF
Parser: NoInputF
Parser: NoInputF
Parser: NoInputF
Parser: IdF
Parser: NoInputF
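If the F-suffixed names in the trace are a problem, one option is to post-process the constructor name before printing it. The helper below is a hypothetical sketch, not part of either library; it assumes every parser function is named parse followed by the constructor name, which the question only confirms for Program, Block and StatementList.

```haskell
-- Hypothetical helper: turn a base-functor constructor name such as
-- "ProgramF" into a trace label such as "parseProgram".
parserLabel :: String -> String
parserLabel con = "parse" ++ dropSuffixF con
  where
    -- Remove a single trailing 'F' if present; leave other names alone.
    dropSuffixF s = case reverse s of
      'F' : rest -> reverse rest
      _          -> s
```

With this in scope, the algebra passed to cata could print "Parser: " ++ parserLabel (conNameOf t) instead of the raw constructor name. Note that a constructor of the original type whose own name ends in F would lose that letter too, so the helper is only safe when applied to names that come from the base functor.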