处理递归和类型时如何减少代码重复

How to reduce code duplication when dealing with recursive sum types

我目前正在为一种编程语言开发一个简单的解释器,我有这样的数据类型:

data Expr
  = Variable String
  | Number Int
  | Add [Expr]
  | Sub Expr Expr

我有很多函数可以做一些简单的事情,比如:

-- Substitute a value for a variable
substituteName :: String -> Int -> Expr -> Expr
substituteName name newValue = go
  where
    go (Variable x)
      | x == name = Number newValue
    go (Add xs) =
      Add $ map go xs
    go (Sub x y) =
      Sub (go x) (go y)
    go other = other

-- Replace subtraction with a constant with addition by a negative number
replaceSubWithAdd :: Expr -> Expr
replaceSubWithAdd = go
  where
    go (Sub x (Number y)) =
      Add [go x, Number (-y)]
    go (Add xs) =
      Add $ map go xs
    go (Sub x y) =
      Sub (go x) (go y)
    go other = other

但是在这些函数中的每一个中,我都必须重复递归调用代码的部分,只需对函数的一部分进行微小的更改。有没有更普遍的方法来做到这一点?我宁愿不必复制和粘贴这部分:

    go (Add xs) =
      Add $ map go xs
    go (Sub x y) =
      Sub (go x) (go y)
    go other = other

并且每次只更改一个案例,因为像这样复制代码似乎效率低下。

我能想出的唯一解决方案是有一个函数,它首先在整个数据结构上调用一个函数,然后在结果上递归调用,如下所示:

recurseAfter :: (Expr -> Expr) -> Expr -> Expr
recurseAfter f x =
  case f x of
    Add xs ->
      Add $ map (recurseAfter f) xs
    Sub x y ->
      Sub (recurseAfter f x) (recurseAfter f y)
    other -> other

substituteName :: String -> Int -> Expr -> Expr
substituteName name newValue =
  recurseAfter $ \case
    Variable x
      | x == name -> Number newValue
    other -> other

replaceSubWithAdd :: Expr -> Expr
replaceSubWithAdd =
  recurseAfter $ \case
    Sub x (Number y) ->
      Add [x, Number (-y)]
    other -> other

但我觉得应该已经有一种更简单的方法可以做到这一点。我错过了什么吗?

恭喜,您刚刚重新发现变形!

这是您的代码,经过重新措辞,以便与 recursion-schemes 包一起使用。 las,它并不短,因为我们需要一些样板文件来使机器工作。 (可能有一些自动方法可以避免样板,例如使用泛型。我只是不知道。)

下面,您的 recurseAfter 被替换为标准的 ana

我们首先定义你的递归类型,以及它作为不动点的函子。

{-# LANGUAGE DeriveFunctor, TypeFamilies, LambdaCase #-}
{-# OPTIONS -Wall #-}
module AnaExpr where

import Data.Functor.Foldable

data Expr
  = Variable String
  | Number Int
  | Add [Expr]
  | Sub Expr Expr
  deriving (Show)

data ExprF a
  = VariableF String
  | NumberF Int
  | AddF [a]
  | SubF a a
  deriving (Functor)

然后我们用几个实例将两者连接起来,这样我们就可以将Expr展开成同构的ExprF Expr,再折回去

type instance Base Expr = ExprF
instance Recursive Expr where
   project (Variable s) = VariableF s
   project (Number i) = NumberF i
   project (Add es) = AddF es
   project (Sub e1 e2) = SubF e1 e2
instance Corecursive Expr where
   embed (VariableF s) = Variable s
   embed (NumberF i) = Number i
   embed (AddF es) = Add es
   embed (SubF e1 e2) = Sub e1 e2

最后,我们调整了您的原始代码,并添加了一些测试。

substituteName :: String -> Int -> Expr -> Expr
substituteName name newValue = ana $ \case
    Variable x | x == name -> NumberF newValue
    other                  -> project other

testSub :: Expr
testSub = substituteName "x" 42 (Add [Add [Variable "x"], Number 0])

replaceSubWithAdd :: Expr -> Expr
replaceSubWithAdd = ana $ \case
    Sub x (Number y) -> AddF [x, Number (-y)]
    other            -> project other

testReplace :: Expr
testReplace = replaceSubWithAdd 
   (Add [Sub (Add [Variable "x", Sub (Variable "y") (Number 34)]) (Number 10), Number 4])

另一种方法是仅定义 ExprF a,然后派生 type Expr = Fix ExprF。这节省了上面的一些样板(例如两个实例),代价是必须使用 Fix (VariableF ...) 而不是 Variable ...,以及其他构造函数的类似方法。

可以使用模式同义词进一步缓解这种情况(但代价是多一点样板文件)。


更新:我终于找到了 automagic 工具,使用模板 Haskell。这使得整个代码相当短。请注意,ExprF 仿函数和上面的两个实例仍然存在,我们仍然需要使用它们。我们只是省去了必须手动定义它们的麻烦,但仅此一项就节省了很多精力。

{-# LANGUAGE DeriveFunctor, DeriveTraversable, TypeFamilies, LambdaCase, TemplateHaskell #-}
{-# OPTIONS -Wall #-}
module AnaExpr where

import Data.Functor.Foldable
import Data.Functor.Foldable.TH

data Expr
  = Variable String
  | Number Int
  | Add [Expr]
  | Sub Expr Expr
  deriving (Show)

makeBaseFunctor ''Expr

substituteName :: String -> Int -> Expr -> Expr
substituteName name newValue = ana $ \case
    Variable x | x == name -> NumberF newValue
    other                  -> project other

testSub :: Expr
testSub = substituteName "x" 42 (Add [Add [Variable "x"], Number 0])

replaceSubWithAdd :: Expr -> Expr
replaceSubWithAdd = ana $ \case
    Sub x (Number y) -> AddF [x, Number (-y)]
    other            -> project other

testReplace :: Expr
testReplace = replaceSubWithAdd 
   (Add [Sub (Add [Variable "x", Sub (Variable "y") (Number 34)]) (Number 10), Number 4])

作为替代方法,这也是 uniplate 包的典型用例。它可以使用 Data.Data 泛型而不是模板 Haskell 来生成样板文件,因此如果您为 Expr:

派生 Data 个实例
import Data.Data

data Expr
  = Variable String
  | Number Int
  | Add [Expr]
  | Sub Expr Expr
  deriving (Show, Data)

然后 Data.Generics.Uniplate.Data 中的 transform 函数递归地对每个嵌套的 Expr:

应用一个函数
import Data.Generics.Uniplate.Data

substituteName :: String -> Int -> Expr -> Expr
substituteName name newValue = transform f
  where f (Variable x) | x == name = Number newValue
        f other = other

replaceSubWithAdd :: Expr -> Expr
replaceSubWithAdd = transform f
  where f (Sub x (Number y)) = Add [x, Number (-y)]
        f other = other

请注意,特别是在replaceSubWithAdd中,函数f被编写为执行非递归替换; transform 使它在 x :: Expr 中递归,因此它对辅助函数的作用与 ana 在@chi 的回答中所做的相同:

> substituteName "x" 42 (Add [Add [Variable "x"], Number 0])
Add [Add [Number 42],Number 0]
> replaceSubWithAdd (Add [Sub (Add [Variable "x", 
                     Sub (Variable "y") (Number 34)]) (Number 10), Number 4])
Add [Add [Add [Variable "x",Add [Variable "y",Number (-34)]],Number (-10)],Number 4]
> 

这不比@chi 的模板 Haskell 解决方案短。一个潜在的优势是 uniplate 提供了一些可能有用的附加功能。例如,如果您使用 descend 代替 transform,它只会转换 直接 子级,这可以让您控制递归发生的位置,或者您可以使用 rewrite 重新变换变换的结果,直到达到固定点。一个潜在的缺点是 "anamorphism" 听起来比 "uniplate" 更酷。

完整节目:

{-# LANGUAGE DeriveDataTypeable #-}

import Data.Data                     -- in base
import Data.Generics.Uniplate.Data   -- package uniplate

data Expr
  = Variable String
  | Number Int
  | Add [Expr]
  | Sub Expr Expr
  deriving (Show, Data)

substituteName :: String -> Int -> Expr -> Expr
substituteName name newValue = transform f
  where f (Variable x) | x == name = Number newValue
        f other = other

replaceSubWithAdd :: Expr -> Expr
replaceSubWithAdd = transform f
  where f (Sub x (Number y)) = Add [x, Number (-y)]
        f other = other

replaceSubWithAdd1 :: Expr -> Expr
replaceSubWithAdd1 = descend f
  where f (Sub x (Number y)) = Add [x, Number (-y)]
        f other = other

main = do
  print $ substituteName "x" 42 (Add [Add [Variable "x"], Number 0])
  print $ replaceSubWithAdd e
  print $ replaceSubWithAdd1 e
  where e = Add [Sub (Add [Variable "x", Sub (Variable "y") (Number 34)])
                     (Number 10), Number 4]