在 Haskell 中为命令式语言编写解释器

Writing an interpreter for an imperative language in Haskell

我正在尝试为 Haskell 中的类 C 语言构建解释器。到目前为止,我已经根据 this 论文编写并组合了小型单子解析器,因此到目前为止我可以生成程序的 AST 表示。我定义抽象语法如下:

data LangType = TypeReal | TypeInt | TypeBool | TypeString deriving (Show)
type Id = String

data AddOp = Plus | Minus | Or deriving (Show)
data RelOp = LT | GT | LTE | GTE | NEq | Eq deriving (Show)
data MultOp = Mult | Div | And deriving (Show)
data UnOp = UnMinus | UnNot deriving (Show)
data BinOp = Rel RelOp | Mul MultOp | Add AddOp deriving (Show)

data AST = Program [Statement] deriving (Show)
data Block = StatsBlock [Statement] deriving (Show)
data Statement = VariableDecl Id LangType Expression
               | Assignment Id Expression
               | PrintStatement Expression
               | IfStatement Expression Block Block
               | WhileStatement Expression Block
               | ReturnStatement Expression
               | FunctionDecl Id LangType FormalParams Block 
               | BlockStatement Block 
               deriving (Show)
data Expression = RealLiteral Double
                | IntLiteral Int
                | BoolLiteral Bool
                | StringLiteral String
                | Unary UnOp Expression
                | Binary BinOp Expression Expression
                | FuncCall Id [Expression]
                | Var Id
                deriving (Show)
data FormalParams = IdentifierType [(Id, LangType)] deriving (Show)

我还没有对我的 AST 进行类型检查并构建解释器来评估表达式和执行语句。我的问题如下:

  1. Does the abstract syntax make sense/can it be improved? In particular, I've been running into a recurring problem. In the EBNF of this language I'm trying to interpret, a WhileStatement consists of an Expression (which I have no problem with) and a Block, which in the EBNF happens to be a Statement just like WhileStatement, and so I cannot refer to Block from my WhileStatement. I've worked around this by defining a separate data type Block (as is shown in the above code), but am not sure if this is the best way. I'm finding defining data types quite confusing.
  2. Since I have to type-check my AST and evaluate/execute, do I implement these separately or can I define some function which does them both at the same time?

任何关于我应该如何进行类型检查和解释语言的一般提示也将不胜感激。由于该语言具有变量和函数声明,我正在考虑实现某种符号 table,尽管我再次为为此定义类型而苦苦挣扎。到目前为止我已经试过了

    import qualified Data.Map as M

    data Value = RealLit Double | IntLit Int | BoolLit Bool | StringLit String | Func [FormalParams] String 
                 deriving (Show)
    type TermEnv = M.Map String Value

但我不确定是否应该使用之前的 LangType

在关于如何进行类型检查和评估的评论中解决您的问题。

如果您不必进行推理或多态性,类型检查非常简单。在这些条件下,类型检查和求值也非常接近。

首先定义一个具有您需要的功能的 monad。对于类型检查器,您需要

  • 类型环境,即Reader(Map Id LangType)组件,用于跟踪局部变量的类型。
  • 一个错误能力,例如ExceptString

所以你可以像这样定义一个 monad

type TypeEnv = Map.Map Id LangType
type TC = ReaderT TypeEnv (Except String)

然后您的类型检查器函数将如下所示:

typeCheck :: AST -> TC ()

(我们 return () 因为除了知道程序是否通过之外,从类型检查过程中没有什么有趣的收获。)

这主要是结构归纳,例如

typeCheck (Program stmt) = -- typecheckStmt each statement*

typeCheckStmt :: Statement -> TC ()
typeCheckStmt (VariableDecl v type defn) = ...
typeCheckStmt (Assignment v exp) = do
    Just t <- asks (Map.lookup v)
    t' <- typeCheckExp exp
    when (t /= t') $ throwError "Types do not match"
...

-- Return the type of a composite expression to use elsewhere
typeCheckExp :: Expression -> TC LangType
...

需要一些技巧来确保同一列表中后面的语句可以看到语句列表中的变量声明。我会把它当作一个谜题。 (提示:请参阅 local 函数以在范围内提供更新的环境。)


评价类似。您是对的,您需要一种 运行 时间值。如果没有一些你可能还没有准备好(并且即使你准备好了,实用性也值得怀疑)的聪明才智,就没有真正在 Value 中使用 LangType 的方法,所以你走在正确的轨道上。

您将需要一个 monad,它支持跟踪变量的值并能够执行您的语言需要的任何其他操作。首先我推荐

type Eval = StateT (Map Id Value) IO

并像以前一样在结构上进行。在处理变量范围和阴影时,将再次需要一些技巧,您可能需要更改环境类型或稍微弄乱您的 Value 类型以适应这些细微之处,但仔细考虑这些问题很重要。从简单开始,不要尝试一次对整个语言实施类型检查和评估。