在 Haskell 中命名没有重名的树结构（使用 children）的最佳方式

Question

假设我有以下数据结构。

data Tree = Tree
  { name        :: String
  , children    :: [Tree]
  , ...
  }

我的目标是能够映射一个树列表及其 children，这样我就可以唯一地命名每棵树，因此下面代码中的 Map 结构表示使用了特定名称，换句话说，Map Name Count。因此，如果我有函数 baseName :: SystemTree -> String，其中 returns 基于未列出属性的未编号名称，它可以与映射中的编号组合，这样即使重用 baseName 也不会使用两次名称.

nameSystemTrees :: Map String Int -> [Tree] -> (Map String Int, [Tree])
nameSystemTrees nameState trees =
  ...

我的问题是，在 Haskell 中解决这个问题的最佳方法是什么？这里可以使用 Foldable 吗？我注意到有 Data.Tree 包，但不幸的是我已经有很多自定义代码来构建这些树，所以我认为需要一些工作才能使用该包中的构造函数。

Answer 1

我认为这回答了我的问题，尽管我仍然想知道是否有更简洁的方法来使用 Foldable 来做到这一点。

mapTree :: ((a, Tree) -> (a, Tree)) -> (a, Tree) -> (a, Tree)
mapTree f startingPoint =
  let
    root = f startingPoint

    (rootAcc, rootTree) = root

    originalChildren :: [Tree]
    originalChildren = children rootTree

    (newAcc, newChildren) =
      foldr (\child (acc, children) ->
        let
          (newAcc, newTree) = mapTree f (acc, child)
        in
          (newAcc, newTree : children)
      ) (rootAcc, []) originalChildren
  in
    ( newAcc
    , rootTree
      { children = newChildren
      }
    )

Answer 2

嗯，你不能使用 Foldable（或相关的 class Traversable），因为这些 classes 是针对类型 * -> * .也就是说，Foldable 实例只能为 data Tree a = ... 之类的类型定义，该类型在另一个类型 a 中被参数化，但您的 data Tree = ... 未被参数化。

你可以做的是编写一个遍历你的树的函数，对每个节点应用一个单子动作，有点像mapM为你的树量身定制的映射整个树的每个节点操作：

mapTreeM :: Monad m => (Tree -> m Tree) -> Tree -> m Tree
mapTreeM f = mtm  -- @f@ is the per-node action, @mtm@ the whole-tree action
  where mtm tree = do
          -- apply node action @f@ to root node
          tree' <- f tree
          -- recurse over children with @mtm@
          children' <- mapM mtm (children tree')
          -- update the children
          return $ tree' { children = children' }

现在，这可以应用任何 monadic 操作，包括基于 State 的 monadic 操作，它分配一个编号后缀，每个 name 都有一个单独的计数器。这是，给定：

data Tree = Tree
  { name :: String
  , children :: [Tree]
  } deriving (Show, Eq)

您可以定义节点重命名器：

uniquifyNode :: Tree -> State (Map String Int) Tree
uniquifyNode node = do
  let nm = name node
  -- get current count for this name
  n <- gets (Map.findWithDefault 1 nm)
  -- store an updated count
  modify (Map.insert nm (n+1))
  -- return uniquified name
  return (node { name = nm ++ show n })

并将两者结合起来创建树重命名器：

uniquifyTree :: Tree -> Tree
uniquifyTree t = evalState (mapTreeM uniquifyNode t) Map.empty

并在树上进行测试：

t0 :: Tree
t0 = Tree "a" [ Tree "a" []
              , Tree "b" [ Tree "a" []
                         , Tree "b" []
                         , Tree "c" []
             ]
      , Tree "c" [ Tree "a" [] ]
      ]

像这样：

> uniquifyTree t0

打印的树相当于：

t1 :: Tree
t1 = Tree "a1" [ Tree "a2" []
               , Tree "b1" [ Tree "a3" []
                           , Tree "b2" []
                           , Tree "c1" []
                           ]
              , Tree "c2" [ Tree "a4" [] ]
              ]

请注意，mapTreeM 本质上等同于您的 mapTree，您可以使用 runState 和 [=35] 根据 mapTreeM 定义 mapTree =] 实际上做除了包装和解包数据类型之外的任何事情：

mapTree :: ((a, Tree) -> (a, Tree)) -> (a, Tree) -> (a, Tree)
mapTree f (a, t) = let (t', a') = runState (mapTreeM g t) a in (a', t')
  where g t = state (\a -> let (a', t') = f (a, t) in (t', a'))

因此，从结构上讲，这与您已经完成的工作没有太大区别。您刚刚重新发明了状态 monad（如 (a, Tree) -> (a, Tree)）并编写了一种自定义 mapM 来遍历树而不使 monadic 动作通用。

关于显式 monadic 版本的一件事是您可以将它与其他一些 monadic 操作一起使用。以下是一些示例：

> -- replace all names with "foo" (Identity action)
> import Data.Functor.Identity
> runIdentity $ mapTreeM (\(Tree n c) -> Identity (Tree "foo" c)) t0
> -- read the names from a file (IO action)
> import System.IO
> withFile "/usr/share/dict/words" ReadMode $ 
    \h -> mapTreeM (\(Tree n c) -> flip Tree c <$> hGetLine h) t0    
> -- get a list of names in order (Writer action)
> import Control.Monad.Writer
> execWriter $ mapTreeM (\t@(Tree n _) -> tell [n] >> return t) t0

总之，完整的程序是：

import Control.Monad.State
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

data Tree = Tree
  { name :: String
  , children :: [Tree]
  } deriving (Show, Eq)

mapTreeM :: Monad m => (Tree -> m Tree) -> Tree -> m Tree
mapTreeM f = mtm
  where mtm tree = do
          tree' <- f tree
          children' <- mapM mtm (children tree')
          return $ tree' { children = children' }

uniquifyNode :: Tree -> State (Map String Int) Tree
uniquifyNode node = do
  let nm = name node
  n <- gets (Map.findWithDefault 1 nm)
  modify (Map.insert nm (n+1))
  return (node { name = nm ++ show n })

uniquifyTree :: Tree -> Tree
uniquifyTree t = evalState (mapTreeM uniquifyNode t) Map.empty

t0 :: Tree
t0 = Tree "a" [ Tree "a" []
              , Tree "b" [ Tree "a" []
                         , Tree "b" []
                         , Tree "c" []
                         ]
              , Tree "c" [ Tree "a" [] ]
              ]

t1 :: Tree
t1 = Tree "a1" [ Tree "a2" []
               , Tree "b1" [ Tree "a3" []
                           , Tree "b2" []
                           , Tree "c1" []
                           ]
              , Tree "c2" [ Tree "a4" [] ]
              ]

main = print $ uniquifyTree t0 == t1

在 Haskell 中命名没有重名的树结构（使用 children）的最佳方式

Best way to name a tree (with children) structure with no duplicate names in Haskell

tree

haskell

functional-programming

fold