Problems with EDSL-related implementation of array subscripting in Haskell

Context

I am trying to implement an EDSL that loosely resembles IBM's OPL (a modelling language for linear programming).

Code

Haskell EDSL code

{-# LANGUAGE GADTs #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}

-- Numbers at the type level
data Peano = Zero | Successor Peano

-- Counting Vector Type. Type information contains current length
data Vector peanoNum someType where
    Nil :: Vector Zero someType
    (:+) :: someType 
            -> Vector num someType 
            -> Vector (Successor num) someType
infixr 5 :+ 

-- Generate Num-th nested types
-- For example: Iterate (Successor (Successor Zero)) [] Double => [[Double]]
type family Iterate peanoNum constructor someType where
    Iterate Zero cons typ = typ
    Iterate (Successor pn) cons typ = 
        cons (Iterate pn cons typ)

-- DSL spec

data Statement =
      DecisionVector [Double]
    | Minimize Statement
    | Iteration `Sum` Expression
    | Forall Iteration Statement
    | Statement :| Statement
    | Constraints Statement
infixl 8 `Sum`
infixl 3 :|

data Iteration =
      String `In` [Double]
    | String `Ins` [String]

data Expression where
    EString :: String -> Expression
    EFloat :: Double -> Expression
    (:?) :: Vector n Expression -> Iterate (n) [] Double -> Expression
    (:*) :: Expression -> Expression -> Expression
    Lt :: Expression -> Expression -> Expression
    Gt :: Expression -> Expression -> Expression
    Id :: String -> Expression
infixr 5 `Lt`
infixr 5 `Gt`
infixr 6 :*
infixr 7 :?

test :: Statement
test = 
    let rawMaterial = 205
        products = ["light", "medium", "heavy"]
        demand = [59, 12, 13]
        processes = [1, 2] 
        production = [[12,16], [1,7], [4,2]]
        consumption = [25, 30]
        -- foo = (EId "p" :+ EId "f" :+ Nil) `Subscript` production
        -- bar = (EId "p" :+ Nil) `Subscript` cost
        run = []
        cost = [300, 400]
    in  
        DecisionVector run :|
        Minimize 
            (Sum ("p" `In` processes) 
                 ((Id "p" :+ Nil) :? cost :*
                  (Id "p" :+ Nil) :? run)) :|
        Constraints 
            (Sum ("p" `In` processes)
                 ((Id "p" :+ Nil) :? consumption :*
                  (Id "p" :+ Nil) :? run `Lt` EFloat rawMaterial) :|
             Forall ("q" `Ins` products)
                    (Sum ("p" `In` processes)
                         ((Id "q" :+ Id "p" :+ Nil) :? production :*
                          (Id "p" :+ Nil) :? run `Gt` 
                          (Id "q" :+ Nil) :? demand)))

instance Show Statement where
    show (DecisionVector v) = show v
    show (Minimize s) = "(Minimize " ++ show s ++ ")"
    show (i `Sum` e) = "(" ++ show i ++ " `Sum` " ++ show e ++ ")"
    show (Forall i e) = "(Forall " ++ show i ++ show e ++ ")"
    show (sa :| sb) = "(" ++ show sa ++ show sb ++ ")"
    show (Constraints s) = "(Constraints " ++ show s  ++ ")"

instance Show Iteration where
    show (str `In` d) = "(" ++ show str ++ " `In` " ++ show d ++ ")"
    show (str `Ins` d) = "(" ++ show str ++ " `Ins` " ++ show d ++ ")"

instance Show Expression where
    show (EString s) = "(EString " ++ show s ++ ")"
    show (EFloat f) = "(EFloat " ++ show f ++ ")"
    show (Lt ea eb) = "(" ++ show ea ++ " `Lt` " ++ show eb ++ ")"
    show (Gt ea eb) = "(" ++ show ea ++ " `Gt` " ++ show eb ++ ")"
    show (ea :* eb) = "(" ++ show ea ++ " :* " ++ show eb ++ ")"
    show (Id s) = "(Id " ++ show s ++ ")"
    show (vec :? dbl) = "(" ++ show vec ++ " :? " ++ "dbl" ++ ")"

instance Show (Vector p Expression) where
    show (Nil) = "Nil"
    show (e :+ v) = "(" ++ show e ++ " :+ " ++ show v ++ ")"

-- eval_opl :: Statement -> [Double]

EDSL compared with OPL

    let rawMaterial = 205
        products = ["light", "medium", "heavy"]
        demand = [59, 12, 13]
        processes = [1, 2] 
        production = [[12,16], [1,7], [4,2]]
        consumption = [25, 30]
        -- foo = (EId "p" :+ EId "f" :+ Nil) `Subscript` production
        -- bar = (EId "p" :+ Nil) `Subscript` cost
        run = []
        cost = [300, 400]
    in  
        DecisionVector run :|
        Minimize 
            (Sum ("p" `In` processes) 
                 ((Id "p" :+ Nil) :? cost :*
                  (Id "p" :+ Nil) :? run)) :|
        Constraints 
            (Sum ("p" `In` processes)
                 ((Id "p" :+ Nil) :? consumption :*
                  (Id "p" :+ Nil) :? run `Lt` EFloat rawMaterial) :|
             Forall ("q" `Ins` products)
                    (Sum ("p" `In` processes)
                         ((Id "q" :+ Id "p" :+ Nil) :? production :*
                          (Id "p" :+ Nil) :? run `Gt` 
                          (Id "q" :+ Nil) :? demand)))

The corresponding OPL code

float rawMaterial                     = 205;
{string} products                     = {"light","medium","heavy"};
float demand[products]                = [59,12,13];
{string} processes                    = {"1","2"};
float production[products][processes] = [[12,16],[1,7],[4,2]];
float consumption[processes]          = [25,30];
float cost[processes]                 = [300,400];

dvar float+ run[processes];

minimize sum (p in processes) cost[p] * run[p];

constraints {
  sum (p in processes) consumption[p] * run[p] <= rawMaterial;
  forall (q in products)
    sum (p in processes) production[q][p] * run[p] >= demand[q];
}

Relevant parts

(:?) :: Vector n Expression -> Iterate (n) [] Double -> Expression

and

instance Show Expression where
    [...]
    show (vec :? dbl) = "(" ++ show vec ++ " :? " ++ "dbl" ++ ")"

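Problem description

OPL uses brackets for array subscripting, and I tried to map subscripting onto my EDSL using the following notation:

(Id "p" :+ Id "f" :+ Nil) :? consumption

which corresponds to OPL in the following sense: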

consumption[p][f]

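In the former, (Id "p" :+ Id "f" :+ Nil) constructs a value of type Vector that carries type-level information about the length of that vector. From the definition of the constructor :?, you can see that Iterate (n) [] Double therefore expands to [[Double]]. This works as expected. However, to make use of the generated syntax tree, I need to pattern match on the actual values.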

show (vec :? dbl) = "(" ++ show vec ++ " :? " ++ "dbl" ++ ")"

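Problem: the line above works, but I don't know how to use the actual data. How do I pattern match on it? Is it possible to use this data at all? Replacing dbl with the obvious

(Iterate (Successor (Successor Zero)) [] Double)

does not work. I also tried to set up a data family, but I could not figure out a way to recursively create a family consisting of all arbitrarily nested lists of Double: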

Double
[Double]
[[Double]]
[[[Double]]]
...

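You have a few options, all of which amount to encoding the depth of the iteration at the value level so that you can pattern match on it.

GADT

The simplest approach is to make a GADT that represents the iterated application of a type constructor: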

data IterateF peanoNum f a where
    ZeroF      :: a                   -> IterateF Zero           f a
    SuccessorF :: f (IterateF pn f a) -> IterateF (Successor pn) f a

instance Functor f => Functor (IterateF peanoNum f) where
    fmap f (ZeroF a)       = ZeroF $ f a
    fmap f (SuccessorF xs) = SuccessorF $ fmap (fmap f) xs

-- There's also an Applicative instance, see Data.Functor.Compose
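To illustrate what this buys you, here is a minimal sketch of consuming such a value by matching on its constructors. The names renderF, wrap1 and wrap2 are hypothetical helpers introduced only for this example, and the kind annotation on n is an addition for readability:

```haskell
{-# LANGUAGE GADTs, DataKinds, KindSignatures #-}

import Data.List (intercalate)

data Peano = Zero | Successor Peano

data IterateF (n :: Peano) f a where
    ZeroF      :: a                  -> IterateF 'Zero          f a
    SuccessorF :: f (IterateF n f a) -> IterateF ('Successor n) f a

-- Each constructor reveals the depth, so the recursion needs no extra evidence.
renderF :: IterateF n [] Double -> String
renderF (ZeroF d)       = show d
renderF (SuccessorF xs) = "[" ++ intercalate "," (map renderF xs) ++ "]"

-- Hypothetical helpers that wrap plain lists of depth one and two.
wrap1 :: [Double] -> IterateF ('Successor 'Zero) [] Double
wrap1 = SuccessorF . map ZeroF

wrap2 :: [[Double]] -> IterateF ('Successor ('Successor 'Zero)) [] Double
wrap2 = SuccessorF . map wrap1
```

For instance, renderF (wrap2 [[12,16],[1,7]]) evaluates to "[[12.0,16.0],[1.0,7.0]]". The cost of this encoding is that plain nested lists have to be wrapped before use.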

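Singletons

If you are wedded to your type family, you can use singletons instead. A singleton is a type with exactly one value, which you can pattern match on to introduce known facts about the type to the compiler. Here is a singleton for the natural numbers: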

{-# LANGUAGE FlexibleContexts #-}

data SPeano pn where
    SZero :: SPeano Zero
    SSuccessor :: Singleton (SPeano pn) => SPeano pn -> SPeano (Successor pn)

class Singleton a where
    singleton :: a

instance Singleton (SPeano Zero) where
    singleton = SZero

instance Singleton (SPeano s) => Singleton (SPeano (Successor s)) where
    singleton = SSuccessor singleton

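The simpler SPeano singleton without the Singleton type class is equivalent, but this version does not require writing as many proofs; it captures them when the successor is constructed instead.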
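As a minimal sketch of how matching on a singleton unlocks the type family, the following uses the simpler SPeano without the Singleton class. The function name showNested is hypothetical, and the kind annotations are additions for readability:

```haskell
{-# LANGUAGE GADTs, DataKinds, KindSignatures, TypeFamilies #-}

import Data.Kind (Type)
import Data.List (intercalate)

data Peano = Zero | Successor Peano

type family Iterate (n :: Peano) (f :: Type -> Type) (a :: Type) :: Type where
    Iterate 'Zero          f a = a
    Iterate ('Successor n) f a = f (Iterate n f a)

-- Plain singleton: exactly one value for each type-level number.
data SPeano (n :: Peano) where
    SZero      :: SPeano 'Zero
    SSuccessor :: SPeano n -> SPeano ('Successor n)

-- Matching on the singleton tells the compiler which equation of
-- Iterate applies, so the second argument becomes a Double or a list.
showNested :: SPeano n -> Iterate n [] Double -> String
showNested SZero          d  = show d
showNested (SSuccessor n) xs =
    "[" ++ intercalate "," (map (showNested n) xs) ++ "]"
```

With this, showNested (SSuccessor (SSuccessor SZero)) [[12,16],[1,7]] evaluates to "[[12.0,16.0],[1.0,7.0]]".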

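If we modify the IterateF GADT from the previous section to capture the same proofs (because I'm lazy), we can convert to the GADT whenever we have an SPeano singleton. Converting back from the GADT is easy in either case.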

data IterateF peanoNum f a where
    ZeroF      ::                          a                   -> IterateF Zero           f a
    SuccessorF :: Singleton (SPeano pn) => f (IterateF pn f a) -> IterateF (Successor pn) f a

toIterateF :: Functor f => SPeano pn -> Iterate pn f a -> IterateF pn f a
toIterateF SZero a = ZeroF a
toIterateF (SSuccessor pn) xs = SuccessorF $ fmap (toIterateF pn) xs

getIterateF :: Functor f => IterateF pn f a -> Iterate pn f a
getIterateF (ZeroF a) = a
getIterateF (SuccessorF xs) = fmap getIterateF xs

Now we can easily create an alternative representation to IterateF: a singleton paired with an application of the Iterate type family.

data Iterated pn f a = Iterated (SPeano pn) (Iterate pn f a)

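I'm lazy and don't like writing proofs that the GADT would handle for me, so I'll keep the IterateF GADT around and write the Functor instance for Iterated in terms of it.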

toIterated :: Functor f => IterateF pn f a -> Iterated pn f a
toIterated xs@(ZeroF      _) = Iterated singleton (getIterateF xs)
toIterated xs@(SuccessorF _) = Iterated singleton (getIterateF xs)

fromIterated :: Functor f => Iterated pn f a -> IterateF pn f a
fromIterated (Iterated pn xs) = toIterateF pn xs

instance Functor f => Functor (Iterated pn f) where
    fmap f = toIterated . fmap f . fromIterated

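The pattern match in toIterated is there to bring into scope the proof captured when SuccessorF was constructed. If we had more complicated things to do, we might want to capture it in a Dict instead.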
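For reference, the pieces of this section can be assembled into one module roughly as follows. This is a sketch rather than a definitive version: the kind annotations, the example value production, and the helpers unwrap2 and doubled are additions for illustration only.

```haskell
{-# LANGUAGE GADTs, DataKinds, KindSignatures, TypeFamilies #-}
{-# LANGUAGE FlexibleInstances, FlexibleContexts #-}

import Data.Kind (Type)

data Peano = Zero | Successor Peano

type family Iterate (n :: Peano) (f :: Type -> Type) (a :: Type) :: Type where
    Iterate 'Zero          f a = a
    Iterate ('Successor n) f a = f (Iterate n f a)

class Singleton a where
    singleton :: a

data SPeano (n :: Peano) where
    SZero      :: SPeano 'Zero
    SSuccessor :: Singleton (SPeano n) => SPeano n -> SPeano ('Successor n)

instance Singleton (SPeano 'Zero) where
    singleton = SZero

instance Singleton (SPeano n) => Singleton (SPeano ('Successor n)) where
    singleton = SSuccessor singleton

data IterateF (n :: Peano) f a where
    ZeroF      :: a -> IterateF 'Zero f a
    SuccessorF :: Singleton (SPeano n)
               => f (IterateF n f a) -> IterateF ('Successor n) f a

instance Functor f => Functor (IterateF n f) where
    fmap g (ZeroF a)       = ZeroF (g a)
    fmap g (SuccessorF xs) = SuccessorF (fmap (fmap g) xs)

toIterateF :: Functor f => SPeano n -> Iterate n f a -> IterateF n f a
toIterateF SZero          a  = ZeroF a
toIterateF (SSuccessor n) xs = SuccessorF (fmap (toIterateF n) xs)

getIterateF :: Functor f => IterateF n f a -> Iterate n f a
getIterateF (ZeroF a)       = a
getIterateF (SuccessorF xs) = fmap getIterateF xs

data Iterated n f a = Iterated (SPeano n) (Iterate n f a)

toIterated :: Functor f => IterateF n f a -> Iterated n f a
toIterated xs@(ZeroF      _) = Iterated singleton (getIterateF xs)
toIterated xs@(SuccessorF _) = Iterated singleton (getIterateF xs)

fromIterated :: Functor f => Iterated n f a -> IterateF n f a
fromIterated (Iterated n xs) = toIterateF n xs

instance Functor f => Functor (Iterated n f) where
    fmap g = toIterated . fmap g . fromIterated

-- Illustrative data: the question's production matrix at depth two.
production :: Iterated ('Successor ('Successor 'Zero)) [] Double
production = Iterated singleton [[12,16],[1,7],[4,2]]

-- Hypothetical helper: at a fixed depth, Iterate reduces to plain lists.
unwrap2 :: Iterated ('Successor ('Successor 'Zero)) [] Double -> [[Double]]
unwrap2 (Iterated _ xs) = xs

-- fmap reaches the innermost Doubles regardless of nesting depth.
doubled :: [[Double]]
doubled = unwrap2 (fmap (* 2) production)
```

Here doubled evaluates to [[24.0,32.0],[2.0,14.0],[8.0,4.0]], and the stored data stays in its ordinary nested-list form throughout.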

Use whatever other encoding you have

In your specific case of
(:?) :: Vector n Expression -> Iterate (n) [] Double -> Expression

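you have a Vector n, which encodes the iteration depth of Iterate n [] at the value level. You can pattern match on the vector, which is either Nil or (_ :+ xs), to prove that the Iterate n [] Double is either a Double or a list. You can use this directly for simple cases such as showing the nested values, or you can convert the Vector n to another singleton in order to use one of the more powerful representations from the previous sections.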

-- The only proof we need to write by hand
ssuccessor :: SPeano pn -> (SPeano (Successor pn))
ssuccessor pred =
    case pred of
        SZero        -> SSuccessor pred
        SSuccessor _ -> SSuccessor pred

lengthSPeano :: Vector pn st -> SPeano pn
lengthSPeano Nil = SZero
lengthSPeano (_ :+ xs) = ssuccessor (lengthSPeano xs)

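To know what values are actually stored in an Iterate n [] Double, you have to know something about n. This information is usually provided by a GADT whose structure mirrors the inductive structure of the index itself (often called a singleton).

Fortunately, you already have the Nat index stored in the structure of the Vector. You have all the information you need at hand; you only need to pattern match! For example: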

-- Requires: import Data.List (intercalate)
instance Show Expression where
    ...
    show (vec :? dbl) = "(" ++ show vec ++ go vec dbl ++ ")" where
      go :: Vector n x -> Iterate n [] Double -> String
      go Nil a = show a
      go (_ :+ n) a = "[" ++ intercalate "," (map (go n) a) ++ "]"

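Note that in the first pattern, the type of Nil gives you n ~ 0, which in turn gives you Iterate 0 [] Double ~ Double (by definition). In the second pattern, you have n ~ k + 1 and Iterate n [] Double ~ [Iterate k [] Double] for some k. (Here 0 stands for Zero and k + 1 for Successor k.) Pattern matching on the Nat essentially lets you look at the inductive structure of the type family.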
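The same pattern match also lets you read individual elements, which is what subscripting ultimately needs. The following is a sketch only: lookupNested is a hypothetical name, and it assumes the index names have already been resolved to zero-based Int positions, which the EDSL in the question does not do yet.

```haskell
{-# LANGUAGE GADTs, DataKinds, KindSignatures, TypeFamilies #-}

import Data.Kind (Type)

data Peano = Zero | Successor Peano

data Vector (n :: Peano) a where
    Nil  :: Vector 'Zero a
    (:+) :: a -> Vector n a -> Vector ('Successor n) a
infixr 5 :+

type family Iterate (n :: Peano) (f :: Type -> Type) (a :: Type) :: Type where
    Iterate 'Zero          f a = a
    Iterate ('Successor n) f a = f (Iterate n f a)

-- Follow one index per nesting level; Nothing means out of range.
-- Assumption: indices are zero-based positions into each list.
lookupNested :: Vector n Int -> Iterate n [] Double -> Maybe Double
lookupNested Nil       d  = Just d
lookupNested (i :+ is) xs
    | i < 0     = Nothing
    | otherwise = case drop i xs of
        (y : _) -> lookupNested is y
        []      -> Nothing
```

For example, lookupNested (2 :+ 0 :+ Nil) [[12,16],[1,7],[4,2]] yields Just 4.0, while an out-of-range index such as (0 :+ 5 :+ Nil) yields Nothing. The length of the index vector fixes the expected nesting depth, so passing a flat list with two indices is rejected at compile time.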

Every function you write over Iterate will look something like

foo :: forall n . Vector n () -> Iterate n F X -> Y  -- for some X,Y

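because you must have such a value-level proof in order to write any inductive function over Iterate. If you don't like carrying these "dummy" values around, you can make them implicit with a class: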
class KnownNat n where
  isNat :: Vector n ()

-- Using the constructors of the question's Peano type
instance KnownNat 'Zero where isNat = Nil
instance KnownNat n => KnownNat ('Successor n) where isNat = () :+ isNat

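But in this case, since your AST already contains a concrete Vector, you don't need to do any extra work to get at the actual value of the index; just pattern match on the vector.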
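For completeness, here is a sketch of how the class-based variant might be used when no Vector is at hand. The names renderWith and render are hypothetical, and the Proxy argument is an assumption of this sketch: because Iterate is not injective, something has to tell the compiler which n is meant.

```haskell
{-# LANGUAGE GADTs, DataKinds, KindSignatures, TypeFamilies #-}
{-# LANGUAGE ScopedTypeVariables #-}

import Data.Kind (Type)
import Data.List (intercalate)
import Data.Proxy (Proxy (..))

data Peano = Zero | Successor Peano

data Vector (n :: Peano) a where
    Nil  :: Vector 'Zero a
    (:+) :: a -> Vector n a -> Vector ('Successor n) a
infixr 5 :+

type family Iterate (n :: Peano) (f :: Type -> Type) (a :: Type) :: Type where
    Iterate 'Zero          f a = a
    Iterate ('Successor n) f a = f (Iterate n f a)

class KnownNat (n :: Peano) where
    isNat :: Vector n ()

instance KnownNat 'Zero where
    isNat = Nil

instance KnownNat n => KnownNat ('Successor n) where
    isNat = () :+ isNat

-- The explicit version: the vector's shape drives the recursion.
renderWith :: Vector n x -> Iterate n [] Double -> String
renderWith Nil       d  = show d
renderWith (_ :+ ks) xs =
    "[" ++ intercalate "," (map (renderWith ks) xs) ++ "]"

-- The implicit version: the class builds the shape vector for us.
render :: forall n. KnownNat n => Proxy n -> Iterate n [] Double -> String
render _ = renderWith (isNat :: Vector n ())
```

For example, render (Proxy :: Proxy ('Successor 'Zero)) [25, 30] evaluates to "[25.0,30.0]". The Proxy carries no runtime structure; it only names the depth.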