如何在不重复自己的情况下使该算法更懒惰?

How do I make this algorithm lazier without repeating myself?

(灵感来自我对 的回答。)

考虑这段代码(它应该找到小于或等于给定输入的最大元素):

data TreeMap v = Leaf | Node Integer v (TreeMap v) (TreeMap v) deriving (Show, Read, Eq, Ord)

closestLess :: Integer -> TreeMap v -> Maybe (Integer, v)
closestLess i = precise Nothing where
  precise :: Maybe (Integer, v) -> TreeMap v -> Maybe (Integer, v)
  precise closestSoFar Leaf = closestSoFar
  precise closestSoFar (Node k v l r) = case i `compare` k of
    LT -> precise closestSoFar l
    EQ -> Just (k, v)
    GT -> precise (Just (k, v)) r

这不是很懒惰。输入 GT 案例后,我们肯定知道最终的 return 值将是 Just 而不是 Nothing,但 Just 仍然不是直到最后才可用。我想让这个变得更懒惰,以便 Just 在输入 GT 案例后立即可用。我对此的测试案例是我希望 Data.Maybe.isJust $ closestLess 5 (Node 3 () Leaf undefined) 评估为 True 而不是触底。这是我可以想到的一种方法:

data TreeMap v = Leaf | Node Integer v (TreeMap v) (TreeMap v) deriving (Show, Read, Eq, Ord)

closestLess :: Integer -> TreeMap v -> Maybe (Integer, v)
closestLess _ Leaf = Nothing
closestLess i (Node k v l r) = case i `compare` k of
  LT -> closestLess i l
  EQ -> Just (k, v)
  GT -> Just (precise (k, v) r)
  where
    precise :: (Integer, v) -> TreeMap v -> (Integer, v)
    precise closestSoFar Leaf = closestSoFar
    precise closestSoFar (Node k v l r) = case i `compare` k of
      LT -> precise closestSoFar l
      EQ -> (k, v)
      GT -> precise (k, v) r

但是,我现在要重复一遍:核心逻辑现在在 closestLessprecise 中。我怎样才能写得既懒惰又不重复自己?

怎么样

GT -> let Just v = precise (Just (k,v) r) in Just v

?

从我的非惰性实现开始,我首先重构了 precise 以接收 Just 作为参数,并相应地概括了它的类型:

data TreeMap v = Leaf | Node Integer v (TreeMap v) (TreeMap v) deriving (Show, Read, Eq, Ord)

closestLess :: Integer -> TreeMap v -> Maybe (Integer, v)
closestLess i = precise Just Nothing where
  precise :: ((Integer, v) -> t) -> t -> TreeMap v -> t
  precise _ closestSoFar Leaf = closestSoFar
  precise wrap closestSoFar (Node k v l r) = case i `compare` k of
    LT -> precise wrap closestSoFar l
    EQ -> wrap (k, v)
    GT -> precise wrap (wrap (k, v)) r

然后,我将其更改为尽早执行 wrap 并在 GT 情况下使用 id 调用自身:

data TreeMap v = Leaf | Node Integer v (TreeMap v) (TreeMap v) deriving (Show, Read, Eq, Ord)

closestLess :: Integer -> TreeMap v -> Maybe (Integer, v)
closestLess i = precise Just Nothing where
  precise :: ((Integer, v) -> t) -> t -> TreeMap v -> t
  precise _ closestSoFar Leaf = closestSoFar
  precise wrap closestSoFar (Node k v l r) = case i `compare` k of
    LT -> precise wrap closestSoFar l
    EQ -> wrap (k, v)
    GT -> wrap (precise id (k, v) r)

这仍然像以前一样工作,只是增加了惰性。

我认为您自己回答的 CPS 版本是最好的,但为了完整起见,这里还有一些想法。 (编辑:Buhr 的回答现在是最有效的。)

第一个想法是摆脱“closestSoFar”累加器,而是让 GT 案例处理所有选择最右边小于参数的值的逻辑。这种形式,GT的情况下可以直接return一个Just:

closestLess1 :: Integer -> TreeMap v -> Maybe (Integer, v)
closestLess1 _ Leaf = Nothing
closestLess1 i (Node k v l r) =
  case i `compare` k of
    LT -> closestLess1 i l
    EQ -> Just (k, v)
    GT -> Just (fromMaybe (k, v) (closestLess1 i r))

这更简单,但是当您遇到很多 GT 个案例时,在堆栈上需要更多 space 。从技术上讲,您甚至可以在累加器形式中使用 fromMaybe(即替换 luqui 的答案中隐含的 fromJust),但这将是一个冗余的、无法访问的分支。

另一个想法是算法实际上有两个 "phases",一个在你点击 GT 之前,一个在你点击 GT 之后,所以你用一个布尔值参数化它来表示这两个阶段,然后使用依赖类型来编码在第二阶段总会有结果的不变量。

data SBool (b :: Bool) where
  STrue :: SBool 'True
  SFalse :: SBool 'False

type family MaybeUnless (b :: Bool) a where
  MaybeUnless 'True a = a
  MaybeUnless 'False a = Maybe a

ret :: SBool b -> a -> MaybeUnless b a
ret SFalse = Just
ret STrue = id

closestLess2 :: Integer -> TreeMap v -> Maybe (Integer, v)
closestLess2 i = precise SFalse Nothing where
  precise :: SBool b -> MaybeUnless b (Integer, v) -> TreeMap v -> MaybeUnless b (Integer, v)
  precise _ closestSoFar Leaf = closestSoFar
  precise b closestSoFar (Node k v l r) = case i `compare` k of
    LT -> precise b closestSoFar l
    EQ -> ret b (k, v)
    GT -> ret b (precise STrue (k, v) r)

您可以利用类型系统,而不是使用显式包装器。请注意,第一个代码片段使用 Maybeprecise 版本:

precise :: Maybe (Integer, v) -> TreeMap v -> Maybe (Integer, v)
precise closestSoFar Leaf = closestSoFar
precise closestSoFar (Node k v l r) = case i `compare` k of
  LT -> precise closestSoFar l
  EQ -> Just (k, v)
  GT -> precise (Just (k, v)) r

与第二个代码片段中没有 Maybeprecise 版本几乎完全相同,可以在 Identity 函子中编写为:

precise :: Identity (Integer, v) -> TreeMap v -> Identity (Integer, v)
precise closestSoFar Leaf = closestSoFar
precise closestSoFar (Node k v l r) = case i `compare` k of
  LT -> precise closestSoFar l
  EQ -> Identity (k, v)
  GT -> precise (Identity (k, v)) r

这些可以统一成一个版本多态在Applicative:

precise :: (Applicative f) => f (Integer, v) -> TreeMap v -> f (Integer, v)
precise closestSoFar Leaf = closestSoFar
precise closestSoFar (Node k v l r) = case i `compare` k of
  LT -> precise closestSoFar l
  EQ -> pure (k, v)
  GT -> precise (pure (k, v)) r

就其本身而言,这并没有多大作用,但如果我们知道 GT 分支将始终 return 一个值,我们可以在 Identity 仿函数,与起始仿函数无关。也就是说,我们可以从 Maybe 仿函数开始,但递归到 GT 分支中的 Identity 仿函数:

closestLess :: Integer -> TreeMap v -> Maybe (Integer, v)
closestLess i = precise Nothing
  where
    precise :: (Applicative t) => t (Integer, v) -> TreeMap v -> t (Integer, v)
    precise closestSoFar Leaf = closestSoFar
    precise closestSoFar (Node k v l r) = case i `compare` k of
      LT -> precise closestSoFar l
      EQ -> pure (k, v)
      GT -> pure . runIdentity $ precise (Identity (k, v)) r

这适用于您的测试用例:

> isJust $ closestLess 5 (Node 3 () Leaf undefined)
True

并且是多态递归的一个很好的例子。

从性能的角度来看,这种方法的另一个好处是 -ddump-simpl 表明没有包装器或字典。它已在类型级别上全部删除,并为两个函子提供了专门的功能:

closestLess
  = \ @ v i eta ->
      letrec {
        $sprecise
        $sprecise
          = \ @ v1 closestSoFar ds ->
              case ds of {
                Leaf -> closestSoFar;
                Node k v2 l r ->
                  case compareInteger i k of {
                    LT -> $sprecise closestSoFar l;
                    EQ -> (k, v2) `cast` <Co:5>;
                    GT -> $sprecise ((k, v2) `cast` <Co:5>) r
                  }
              }; } in
      letrec {
        $sprecise1
        $sprecise1
          = \ @ v1 closestSoFar ds ->
              case ds of {
                Leaf -> closestSoFar;
                Node k v2 l r ->
                  case compareInteger i k of {
                    LT -> $sprecise1 closestSoFar l;
                    EQ -> Just (k, v2);
                    GT -> Just (($sprecise ((k, v2) `cast` <Co:5>) r) `cast` <Co:4>)
                  }
              }; } in
      $sprecise1 Nothing eta

我们不仅总是知道Just第一次发现之后,我们也总是知道Nothing 直到[=23] =] 然后。这实际上是两个不同的 "logics".

所以,我们首先向左走,所以 那个 明确:

data TreeMap v = Leaf | Node Integer v (TreeMap v) (TreeMap v) 
                 deriving (Show, Read, Eq, Ord)

closestLess :: Integer 
            -> TreeMap v 
            -> Maybe (Integer, v)
closestLess i = goLeft 
  where
  goLeft :: TreeMap v -> Maybe (Integer, v)
  goLeft n@(Node k v l _) = case i `compare` k of
          LT -> goLeft l
          _  -> Just (precise (k, v) n)
  goLeft Leaf = Nothing

  -- no more maybe if we're here
  precise :: (Integer, v) -> TreeMap v -> (Integer, v)
  precise closestSoFar Leaf           = closestSoFar
  precise closestSoFar (Node k v l r) = case i `compare` k of
        LT -> precise closestSoFar l
        EQ -> (k, v)
        GT -> precise (k, v) r

代价是我们最多重复一个步骤一次。