类型安全流(状态机)

Type-safe Flow (State Machine)

我正在尝试在 Haskell 中创建类型安全的问答流程。我将 QnA 建模为有向图,类似于 FSM。

图中的每个节点代表一个问题:

data Node s a s' = Node {
  question :: Question a,
  process :: s -> a -> s'
}

s是输入状态,a是问题的答案,s'是输出状态。节点取决于输入状态 s,这意味着为了处理答案,我们之前必须处于特定状态。

Question a 代表一个简单的问题/答案,产生 a.

类型的答案

我的意思是类型安全,例如给定一个节点 Node2 :: si -> a -> s2,如果 si 依赖于 s1 那么所有以 Node2 结尾的路径都必须通过通过首先产生 s1 的节点。 (如果 s1 == si 那么 Node2 的所有前辈都必须生成 s1)。

一个例子

QnA:在一个在线购物网站中,我们需要询问用户的体型和喜欢的颜色。

  1. e1:询问用户是否知道自己的尺码。如果是,则转到 e2 否则转到 e3
  2. e2: 询问用户尺码,去ef询问颜色
  3. e3:(用户不知道自己的尺码),询问用户的体重并转到e4
  4. e4:(在e3之后)询问用户的身高并计算他们的尺寸并转到ef.
  5. ef:询问用户最喜欢的颜色,并以 Final 结果结束流程。

在我的模型中,Edges 将 Nodes 相互连接:

data Edge s sf where
  Edge  :: EdgeId -> Node s a s' -> (s' -> a -> Edge s' sf) -> Edge s sf
  Final :: EdgeId -> Node s a s' -> (s' -> a -> sf) -> Edge s sf

sf是QnA的最终结果,这里是:(Bool, Size, Color).

每个时刻的QnA状态可以用元组表示:(s, EdgeId)。这个状态是可序列化的,我们应该能够通过知道这个状态来继续 QnA。

saveState :: (Show s) => (s, Edge s sf) -> String
saveState (s, Edge eid n _) = show (s, eid)

getEdge :: EdgeId -> Edge s sf
getEdge = undefined --TODO

respond :: s -> Edge s sf -> Input -> Either sf (s', Edge s' sf)
respond s (Edge ...) input = Right (s', Edge ...)
respond s (Final ...) input = Left s' -- Final state

-- state = serialized (s, EdgeId)
-- input = user's answer to the current question
main' :: String -> Input -> Either sf (s', Edge s' sf)
main' state input =
  let (s, eid) = read state :: ((), EdgeId) --TODO
      edge = getEdge eid
  in respond s input edge

完整代码:

{-# LANGUAGE GADTs, RankNTypes, TupleSections #-}

type Input = String
type Prompt = String
type Color = String
type Size = Int
type Weight = Int
type Height = Int

data Question a = Question {
  prompt :: Prompt,
  answer :: Input -> a
}

-- some questions 
doYouKnowYourSizeQ :: Question Bool
doYouKnowYourSizeQ = Question "Do you know your size?" read

whatIsYourSizeQ :: Question Size
whatIsYourSizeQ = Question "What is your size?" read

whatIsYourWeightQ :: Question Weight
whatIsYourWeightQ = Question "What is your weight?" read

whatIsYourHeightQ :: Question Height
whatIsYourHeightQ = Question "What is your height?" read

whatIsYourFavColorQ :: Question Color
whatIsYourFavColorQ = Question "What is your fav color?" id

-- Node and Edge

data Node s a s' = Node {
  question :: Question a,
  process :: s -> a -> s'
}

data Edge s sf where
  Edge  :: EdgeId -> Node s a s' -> (s' -> a -> Edge s' sf) -> Edge s sf
  Final :: EdgeId -> Node s a s' -> (s' -> a -> sf) -> Edge s sf

data EdgeId = E1 | E2 | E3 | E4 | Ef deriving (Read, Show)

-- nodes

n1 :: Node () Bool Bool
n1 = Node doYouKnowYourSizeQ (const id)

n2 :: Node Bool Size (Bool, Size)
n2 = Node whatIsYourSizeQ (,)

n3 :: Node Bool Weight (Bool, Weight)
n3 = Node whatIsYourWeightQ (,)

n4 :: Node (Bool, Weight) Height (Bool, Size)
n4 = Node whatIsYourHeightQ (\ (b, w) h -> (b, w * h))

n5 :: Node (Bool, Size) Color (Bool, Size, Color)
n5 = Node whatIsYourFavColorQ (\ (b, i) c -> (b, i, c))


-- type-safe edges

e1 = Edge E1 n1 (const $ \ b -> if b then e2 else e3)
e2 = Edge E2 n2 (const $ const ef)
e3 = Edge E3 n3 (const $ const e4)
e4 = Edge E4 n4 (const $ const ef)
ef = Final Ef n5 const


ask :: Edge s sf -> Prompt
ask (Edge _ n _) = prompt $ question n
ask (Final _ n _) = prompt $ question n

respond :: s -> Edge s sf -> Input -> Either sf (s', Edge s' sf)
respond s (Edge _ n f) i =
  let a  = (answer $ question n) i
      s' = process n s a
      n' = f s' a
  in Right undefined --TODO n'

respond s (Final _ n f) i =
  let a  = (answer $ question n) i
      s' = process n s a
  in Left undefined --TODO s'


-- User Interaction:

saveState :: (Show s) => (s, Edge s sf) -> String
saveState (s, Edge eid n _) = show (s, eid)

getEdge :: EdgeId -> Edge s sf
getEdge = undefined --TODO

-- state = serialized (s, EdgeId) (where getEdge :: EdgeId -> Edge s sf)
-- input = user's answer to the current question
main' :: String -> Input -> Either sf (s', Edge s' sf)
main' state input =
  let (s, eid) = undefined -- read state --TODO
      edge = getEdge eid
  in respond s edge input

保持边缘类型安全对我来说很重要。意思是例如错误地将 e2 链接到 e3 一定是一个类型错误:e2 = Edge E2 n2 (const $ const ef) is fine by e2 = Edge E2 n2 (const $ const e3) 一定是一个错误。

我已经用 --TOOD 指出了我的问题:

我对 Node s a s'Edge s sf 的设计可能完全是错误的。我不必坚持下去。

为了有一个简单的例子来解释,我将向您展示一个解决方案,它没有对挂起、持久化和恢复计算的自然支持。最后,我将向您介绍如何添加它的要点 - 希望您能够自己弄清楚它的细节。


这里就是所谓的索引状态monad:

newtype IStateT m i o a = IStateT { runIState :: i -> m (o, a) }

IStateT 就像一个常规的状态 monad 转换器,不同之处在于隐式状态的类型在整个计算过程中是允许改变的。索引状态 monad 中的排序动作要求一个动作的输出状态与下一个动作的输入状态相匹配。这种类似多米诺骨牌的顺序就是 Atkey's parameterised monad (or ) 的目的。

class IMonad m where
    ireturn :: a -> m i i a
    (>>>=) :: m i j a -> (a -> m j k b) -> m i k b

(>>>) :: IMonad m => m i j a -> m j k b -> m i k b
mx >>> my = mx >>>= \_ -> my

IMonad 是类似 monad 的事物的 class,它描述了通过索引图的路径。 (>>>=) 的类型表示 "If you have a computation which goes from i to j and a computation from j to k, I can join them up to give you a computation from i to k".

我们还需要将计算从 classical monads 提升到索引 monads:

class IMonadTrans t where
    ilift :: Monad m => m a -> t m i i a

请注意 IStateT 的代码与常规状态 monad 的代码相同 - 只是类型变得更智能了。

iget :: Monad m => IStateT m s s s
iget = IStateT $ \s -> return (s, s)

iput :: Monad m => o -> IStateT m i o ()
iput x = IStateT $ \_ -> return (x, ())

imodify :: Monad m => (i -> o) -> IStateT m i o ()
imodify f = IStateT $ \s -> return (f s, ())

instance Monad m => IMonad (IStateT m) where
    ireturn x = IStateT (\s -> return (s, x))
    IStateT f >>>= g = IStateT $ \s -> do
                                    (s', x) <- f s
                                    let IStateT h = g x
                                    h s'
instance IMonadTrans IStateT where
    ilift m = IStateT $ \s -> m >>= \x -> return (s, x)

想法是像 askSizeaskWeight(如下)这样的单子动作将 向隐式环境添加一些数据,增加其类型。因此,我将使用嵌套元组构建隐式环境,将它们视为类型级别的类型列表。嵌套元组比平面元组更灵活(尽管效率较低),因为它们允许您对列表的尾部进行抽象。这允许您构建任意大小的元组。

type StateMachine = IStateT IO

newtype Size = Size Int
newtype Height = Height Int
newtype Weight = Weight Int
newtype Colour = Colour String

-- askSize takes an environment of type as and adds a Size element
askSize :: StateMachine as (Size, as) ()
askSize = askNumber "What is your size?" Size

-- askHeight takes an environment of type as and adds a Height element
askHeight :: StateMachine as (Height, as) ()
askHeight = askNumber "What is your height?" Height

-- etc
askWeight :: StateMachine as (Weight, as) ()
askWeight = askNumber "What is your weight?" Weight

askColour :: StateMachine as (Colour, as) ()
askColour =
    -- poor man's do-notation. You could use RebindableSyntax
    ilift (putStrLn "What is your favourite colour?") >>>
    ilift readLn                                      >>>= \answer ->
    imodify (Colour answer,)

calculateSize :: Height -> Weight -> Size
calculateSize (Height h) (Weight w) = Size (h - w)  -- or whatever the calculation is

askNumber :: String -> (Int -> a) -> StateMachine as (a, as) ()
askNumber question mk =
    ilift (putStrLn question) >>>
    ilift readLn              >>>= \answer ->
    case reads answer of
        [(x, _)] -> imodify (mk x,)
        _ -> ilift (putStrLn "Please type a number") >>> askNumber question mk

askYN :: String -> StateMachine as as Bool
askYN question =
    ilift (putStrLn question) >>>
    ilift readLn              >>>= \answer ->
    case answer of
        "y" -> ireturn True
        "n" -> ireturn False
        _ -> ilift (putStrLn "Please type y or n") >>> askYN question

我的实现与您的规范略有不同。你说先问用户尺码再问体重应该不可能。我说这应该是可能的——结果不一定是你想要的类型(因为你已经向环境添加了两种东西,而不是一种)。这在这里很有用,其中 askOrCalculateSize 只是一个黑盒子,它向环境添加了 Size(没有其他任何东西)。有时它通过直接询问尺寸来做到这一点;有时它会先询问身高和体重来计算。就类型检查器而言,这无关紧要。

interaction :: StateMachine xs (Colour, (Size, xs)) ()
interaction =
    askYN "Do you know your size?" >>>= \answer ->
    askOrCalculateSize answer >>>
    askColour

    where askOrCalculateSize True = askSize
          askOrCalculateSize False =
            askWeight >>>
            askHeight >>>
            imodify (\(h, (w, xs)) -> ((calculateSize h w), xs))

还有一个问题:如何从持续状态恢复计算?你不知道输入环境的类型(虽然假设输出总是 (Colour, Size) 是安全的)因为它在整个计算过程中会发生变化,并且你不知道直到你加载持久状态它是最多。

诀窍是使用一些 GADT 证据,您可以对其进行模式匹配以 了解 类型是什么。 Stage 表示您可以在此过程中到达的位置,它由该阶段环境应具有的类型索引。 SuspendedStage 与计算暂停时环境中的实际数据配对。

data Stage as where
    AskSize :: Stage as
    AskWeight :: Stage as
    AskHeight :: Stage (Weight, as)
    AskColour :: Stage (Size, as)

data Suspended where
    Suspended :: Stage as -> as -> Suspended

resume :: Suspended -> StateMachine as (Colour, (Size, as)) ()
resume (Suspended AskSize e) =
    iput e                                               >>>
    askSize                                              >>>
    askColour
resume (Suspended AskWeight e) =
    iput e                                               >>>
    askWeight                                            >>>
    askHeight                                            >>>
    imodify (\(h, (w, xs)) -> ((calculateSize h w), xs)) >>>
    askColour
resume (Suspended AskHeight e) =
    iput e                                               >>>
    askHeight                                            >>>
    imodify (\(h, (w, xs)) -> ((calculateSize h w), xs)) >>>
    askColour
resume (Suspended AskColour e) =
    iput e                                               >>>
    askColour

现在您可以在计算中添加暂停点:

-- given persist :: Suspended -> IO ()
suspend :: Stage as -> StateMachine as as ()
suspend stage =
    iget                                  >>>= \env
    ilift (persist (Suspended stage env))

resume 有效,但它非常丑陋并且有很多代码重复。这是因为一旦你将一个状态 monad 放在一起,你就不能再将它拆开来查看它的内部。您不能在计算中的给定点跳入。这是您原始设计的一大优势,其中您将状态机表示为可以查询的数据结构,以便弄清楚如何恢复计算。这称为 initial 编码,而我的示例(将状态机表示为函数)是 final 编码。最终编码很简单,但初始编码很灵活。希望您能了解如何将初始方法应用于索引 monad 设计。