为固定 API 的安全查询字符串编写 Haskell 选项数据类型

Writing a Haskell Options Datatype for Safe Query Strings for Fixed API

我正在通过编写一个向音乐目录发出 Web 请求的应用程序来练习“real-world”Haskell。我可以使用 titleartistyear 等可选参数的任意组合来调用 https://example.com/search 等端点。例如,以下任何组合都是有效的:

https://example.com/search?title="Ecoute moi Camarade"
https://example.com/search?title="Ecoute moi Camarade"&artist="Mazouni"
https://example.com/search?year=1974&artist="Mazouni"

我可以使用 req 以友好的方式构建查询参数列表,

import qualified Network.HTTP.Req as Req
import qualified Data.Aeson as AE

makeSearch :: IO ()
makeSearch = Req.runReq Req.defaultHttpConfig $ do
    let url = https "example.com" /: "search"
    let params =
          "artist" =: ("Ecoute moi Camarade" :: Text)  <>
          "track"  =: ("Mazouni" :: Text)
    r <- (req GET url NoReqBody jsonResponse params) :: (Req.Req (Req.JsonResponse AE.Value))
    liftIO $ print (Req.responseBody r :: AE.Value)

我希望 makeSearch 函数接受可选参数的任意组合。两个最简单的选项是:

相反,我想定义一些 Haskell 数据类型来模拟我对正在消费的 API 的了解。请注意,我 NOT 可以控制网络 API 本身。

所需的用法

我认为以下简单的标准是合理的:

例如,像下面这样的内容对调用者来说会很好:

makeSearch (searchArtist "Mazouni" <> searchTitle "Ecoute moi Camarade")
makeSearch (searchYear 1974)

尝试 1:MonoidLast

我尝试实现我在使用 Monoid

之前看到的模式
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE DerivingVia #-}
import GHC.Generics ( Generic )
import Data.Monoid.Generic

data SearchOpts = SearchOpts {
        searchArtist :: Last Text,
        searchTitle  :: Last Text,
        searchYear   :: Last Integer
    } deriving (Generic, Show, Eq)
      deriving Semigroup via GenericSemigroup SearchOpts
      deriving Monoid    via GenericMonoid SearchOpts

但是,如果我们只想按标题搜索,我们仍然需要为其余选项提供 Nothing。我可以像下面这样定义一些辅助函数,但如果它们以某种方式自动生成会更好。

matchArtist :: Text -> SearchOpts
matchArtist name = mempty { searchArtist = Last (Just name) }

matchTitle :: Text -> SearchOpts
matchTitle title = mempty { searchTitle = Last (Just title) }

matchYear :: Text -> SearchOpts
matchYear t = mempty { searchYear = Last (Just t) }

此外,我还没有找到一种干净的方法来使用这种方法来实现 makeSearch。并发症是:

梦想

这种操作在 TypeScript 中很常见。这里有一个 simple example. Using UrlSearchParams 可以进一步简化,但这不是一个公平的比较。

interface SearchOpts {
    artist ?: string,
    title  ?: string,
    year   ?: number
}

function makeSearch(opts: SearchOpts): string {
  var params:string[] = [];

  if(opts.artist) { params.push("artist=" + encodeURIComponent(opts.artist)); }
  if(opts.title)  { params.push("title="  + encodeURIComponent(opts.title));  }
  if(opts.year)   { params.push("year="   + encodeURIComponent(opts.year));   }
  
  return params.join("&");
}

makeSearch({ title: "T"})                 // OK
makeSearch({ title: "T", artist: "A"})    // OK
makeSearch({ year: 1974, artist: "A"})    // OK
makeSearch({ title: "T"})                 // OK
makeSearch({ title: "T", extra: "Extra"}) // Error! (as desired)

问题

您建议如何解决 Haskell 中的这个问题?谢谢!

编辑:解决方案基于 Daniel Wagner 的回答

下面的 SearchOptsmakeSearch 实现还不错。我也会研究镜头和模板 Haskell!

data SearchOpts = SearchOpts {
        searchArtist :: Maybe Text,
        searchTitle  :: Maybe Text,
        searchYear   :: Maybe Text
    } deriving (Eq, Ord, Read, Show)

instance Default SearchOpts where
    def = SearchOpts Nothing Nothing Nothing

matchArtist :: Text -> SearchOpts
matchArtist a = def { searchArtist = Just a }

matchTitle :: Text -> SearchOpts
matchTitle t = def { searchTitle = Just t }

matchYear :: Text -> SearchOpts
matchYear y = def { searchYear = Just y }

-- App is a MonadHttp instance
makeSearch :: SearchOpts -> App SearchResults
makeSearch query = do
    let url = https "example.com" /: "search"

    let args = [
            ("artist" , searchArtist query),
            ("title"  , searchTitle query),
            ("type"   , searchYear query)
          ]
    
    let justArgs = [ (key,v) | arg@(key, Just v) <- args ]
    let params   = (map (uncurry (=:)) justArgs)
    let option   = (foldl (<>) mempty params)
    
    -- defined elsewhere
    makeReq url option

标准技巧是只使用 Maybe(而不是 Last)并定义一个 Default 实例:

data SearchOpts = SearchOpts
    { searchArtist :: Maybe Text
    , searchTitle :: Maybe Text
    , searchYear :: Maybe Integer
    } deriving (Eq, Ord, Read, Show)

instance Default SearchOpts where
    def = SearchOpts Nothing Nothing Nothing

现在可以很容易地只提供您想要的字段,只需编写如下内容即可:

def { searchArtist = Just "Mazouni" }
-- or
def
    { searchArtist = Just "Mazouni"
    , searchTitle = Just "Ecoute moi Camarade"
    }

如果你嫁给了 Monoid 实例(也许是因为它让呼叫者跳过 Just)你仍然可以给一个。

instance Semigroup SearchOpts where
    SearchOpts a t y <> SearchArtist a' t' y'
        = SearchOpts (a <|> a') (t <|> t') (y <|> y')
instance Monoid SearchOpts where mempty = def

要自动生成单字段“构造函数”,您可以查看一些模板 Haskell; makeLenses 或其变体也有可能将您带到您需要去的地方。

只是为了好玩,这是第二个答案,它使用了一种非常不同的技术。我们将创建一个从字段名到它们的类型的类型级映射;然后我们将创建一个可以包含给定字段的任何子集并支持字段查找的类型。首先我们深吸一口气,充满类型级编程的氛围...

{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedLabels #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}

import Data.Kind
import Data.Maybe
import Data.Type.Equality
import GHC.OverloadedLabels
import GHC.Prim
import GHC.TypeLits
import Unsafe.Coerce

我想做的第一件事就是向您保证 Unsafe.Coerce 并没有那么糟糕。因此,我将首先介绍整个可信计算基础——对 unsafeCoerce 的所有调用。我希望你会同意它们是相当合理的;声称术语级别和类型级别的字符串比较操作彼此一致。

data SOrdering x where
    SLT :: SOrdering LT
    SEQ :: SOrdering EQ
    SGT :: SOrdering GT

scompare :: (KnownSymbol s, KnownSymbol s') =>
    Proxy# s -> Proxy# s' -> SOrdering (CmpSymbol s s')
scompare s s' = case compare (symbolVal' s) (symbolVal' s') of
    LT -> unsafeCoerce SLT
    EQ -> unsafeCoerce SEQ
    GT -> unsafeCoerce SGT

好的,现在,我们要介绍一个类型级别的映射。当其中一个映射有重复键时,我们想向用户投诉;有多种方法可以做到这一点,但我们要做到这一点的方法是保持类型级映射排序。这使得检查重复项变得容易。那么让我们定义一个类型级别的排序!

type family Sort kvs where
    Sort '[] = '[]
    Sort '[kv] = '[kv]
    Sort kvs = Merge (SortBoth (Split kvs))

type family Split xs where
    Split '[] = '( '[], '[] )
    Split (x:xs) = SplitHelper x (Split xs)

type family SplitHelper x rec where
    SplitHelper x '(xs, xs') = '(x:xs', xs)

type family SortBoth kvsPair where
    SortBoth '(kvs, kvs') = '(Sort kvs, Sort kvs')

type family Merge kvsPair where
    Merge '(('(k, v):kvs), ('(k', v'):kvs')) = CataOrdering (CmpSymbol k k')
        ('(k, v):Merge '(kvs, ('(k', v'):kvs')))
        (TypeError (Text "Duplicate key " :<>: ShowType k :<>: Text " in Merge"))
        ('(k', v'):Merge '(('(k, v):kvs), kvs'))
    Merge '( kvs, '[] ) = kvs
    Merge '( '[], kvs' ) = kvs'

type family CataOrdering ordering lt eq gt where
    CataOrdering LT lt eq gt = lt
    CataOrdering EQ lt eq gt = eq
    CataOrdering GT lt eq gt = gt

如果我们有这些排序映射之一,我们可以通过创建这种新数据类型的值来创建反映它的术语级映射:

data Map kvs where
    Nil :: Map '[]
    Cons :: KnownSymbol k => Proxy# k -> v -> Map kvs -> Map ('(k, v):kvs)

当然,还有比链表更高效的数据结构;我把它作为练习留给 reader 来完成其中一项工作所需的类型级黑客攻击!哎呀。

实际上,Cons 构造函数是不安全的——它不保留排序要求,也不保留非重复要求。所以一般情况下,我们不会暴露这个Map的构造函数;相反,我们将公开以下 API 以创建映射:

instance (KnownSymbol k, kv ~ '[ '(k, v) ]) => IsLabel k (v -> Map kv) where
    fromLabel v = Cons proxy# v Nil

(<<>>) :: Map kvs -> Map kvs' -> Map (Merge '(kvs, kvs'))
m@(Cons p v mt) <<>> m'@(Cons p' v' mt') = case scompare p p' of
    SLT -> Cons p v (mt <<>> m')
    SEQ -> error impossible
    SGT -> Cons p' v' (m <<>> mt')
    where
    impossible = unwords
        ["The impossible happened: duplicate key"
        , symbolVal' p
        , "in (<<>>)), but no type error!"
        ]
Nil <<>> m' = m'
m <<>> Nil = m

IsLabel 实例让我们可以为 artist 字段中具有 String "Mazouni" 的映射编写 #artist "Mazouni"(<<>>) 操作合并字段;例如,#title "Ecoute moi Camarade" <<>> #artist "Mazouni" 表示一个包含两个字段的数据结构。检查它的类型——artist 已经排在 title:

之前
> :t #title "Ecoute moi Camarade" <<>> #artist "Mazouni"
#title "Ecoute moi Camarade" <<>> #artist "Mazouni"
  :: Map '[ '("artist", [Char]), '("title", [Char])]
> :t #artist "Mazouni" <<>> #title "Ecoute moi Camarade"
#artist "Mazouni" <<>> #title "Ecoute moi Camarade"
  :: Map '[ '("artist", [Char]), '("title", [Char])]

如果用户不小心两次包含相同的字段,他们在使用映射时会收到错误消息:

> f :: Map '[] -> (); f _ = ()
> f (#artist "Mazouni" <<>> #title "Ecoute moi Camarade" <<>> #artist "Bray")
    • Duplicate key "artist" in Merge
    • In the first argument of ‘f’, namely
        ‘(#artist "Mazouni" <<>> #title "Ecoute moi Camarade"
            <<>> #artist "Bray")’
      In the expression:
        f (#artist "Mazouni" <<>> #title "Ecoute moi Camarade"
             <<>> #artist "Bray")
      In an equation for ‘it’:
          it
            = f (#artist "Mazouni" <<>> #title "Ecoute moi Camarade"
                   <<>> #artist "Bray")

接下来我们实现查找。当我们在其中一个映射中查找一个字段时,我们将期望它在映射的使用者中具有特定类型。所以我们需要一种方法来检查用户提供的映射的类型是否与我们期望的类型兼容。以下是我们的做法:

type family AllCompatible kvs kvs' where
    AllCompatible '[] kvs' = CTrue
    AllCompatible ('(k, v):kvs) kvs' = (Compatible k v kvs', AllCompatible kvs kvs')

type family Compatible k v kvs where
    Compatible k v '[] = CTrue
    Compatible k v ('(k', v'):kvs) = CataOrdering (CmpSymbol k k')
        CTrue (v ~ v') (Compatible k v kvs)

type CTrue = () :: Constraint

type family LookupRaw k kvs kvsOriginal where
    LookupRaw k '[] kvsO = MissingKey k kvsO
    LookupRaw k ('(k', v):kvs) kvsO = CataOrdering (CmpSymbol k k')
        (MissingKey k kvsO)
        v
        (LookupRaw k kvs kvsO)

type family MissingKey k kvs where
    MissingKey k kvs = TypeError
        (    Text "Missing key in Lookup"
        :$$: Text "\tKey: " :<>: ShowType k
        :$$: Text "\tMapping: " :<>: ShowType kvs
        )

type Lookup k kvs = LookupRaw k kvs kvs

Compatible 检查某些字段是否具有某些类型(或缺失——这是允许的); Lookup 从我们预期的字段映射中获取预期的类型。这是术语级查找例程(称为 search 因为 lookupPrelude 占用):

search :: forall kvs k kvs'.
    (KnownSymbol k, Compatible k (Lookup k kvs) kvs') =>
    Map kvs' -> Maybe (Lookup k kvs)
search Nil = Nothing
search (Cons p v mt) = case scompare (proxy# @k) p of
    SLT -> Nothing
    SEQ -> Just v
    SGT -> search @kvs @k mt

您应该将上述所有内容视为一种小型图书馆。他们一劳永逸地完成了。下一位是您在何处使用它以及您关心的应用程序参数。因此,例如,对于您在问题中描述的字段,您可以这样写:

-- calling Sort is defensive programming, in case some future idiot
-- (possibly you!) adds a field in the wrong order
type Opts = Sort
    [ '("artist", String)
    , '("title", String)
    , '("year", Integer)
    ]

showReq :: AllCompatible Opts opts => Map opts -> String
showReq opts = unwords
    [ fromMaybe "<no artist>" (search @Opts @"artist" opts)
    , fromMaybe "<no title>" (search @Opts @"title" opts)
    , maybe "<no year>" show (search @Opts @"year" opts)
    ]

showReq 的实现由编译器检查其字段;例如,如果您不小心写了,请说:

showReq :: AllCompatible Opts opts => Map opts -> String
showReq = search @Opts @"aritst"

你会得到一个错误:

    • Missing key in Lookup
        Key: "aritst"
        Mapping: '[ '("artist", [Char]), '("title", [Char]),
                   '("year", Integer)]
    • In the expression: search @Opts @"aritst"
      In an equation for ‘showReq’: showReq = search @Opts @"aritst"

以下是用户使用 showReq 的情况:

> showReq (#artist "Mazouni" <<>> #title "Ecoute moi Camarade")
"Mazouni Ecoute moi Camarade <no year>"
> showReq (#year 1974)
"<no artist> <no title> 1974"

...不幸的是,在当前的实施中,最终用户无法避免拼写错误:

> showReq (#aritst "Mazouni")
"<no artist> <no title> <no year>"

这是可以解决的,但我已经摆脱了类型级别的黑客攻击。我鼓励你尝试一下!