幽灵类型背后的动机？

Question

Don Stewart 的 Haskell in the Large 的演讲提到了 幻影类型:

data Ratio n = Ratio Double
1.234 :: Ratio D3

data Ask ccy = Ask Double
Ask 1.5123 :: Ask GBP

我阅读了他关于它们的要点，但我不理解它们。另外，我阅读了关于该主题的Haskell Wiki。然而我还是没明白他们的意思。

使用幻影类型的动机是什么？

Answer 1

使用虚类型的动机是专门化 return 类型的数据构造函数。例如，考虑：

data List a = Nil | Cons a (List a)

Nil 和 Cons 的 return 类型默认为 List a（对所有 a 类型的列表通用）。

Nil  ::                List a
Cons :: a -> List a -> List a
                       |____|
                          |
                    -- return type is generalized

还要注意 Nil 是一个幻象构造函数（即它的 return 类型不依赖于它的参数，在这种情况下是空洞的，但仍然是一样的）。

因为 Nil 是一个幻象构造函数，我们可以将 Nil 专门化为我们想要的任何类型（例如 Nil :: List Int 或 Nil :: List Char）。

Haskell 中的普通代数数据类型允许您选择数据构造函数的参数类型。例如，我们为上面的 Cons 选择了参数类型（a 和 List a）。

但是，它不允许您选择 return 类型的数据构造函数。 return 类型总是泛化的。这对大多数情况都很好。但是，也有例外。例如：

data Expr a = Number     Int
            | Boolean    Bool
            | Increment (Expr Int)
            | Not       (Expr Bool)

数据构造函数的类型是：

Number    :: Int       -> Expr a
Boolean   :: Bool      -> Expr a
Increment :: Expr Int  -> Expr a
Not       :: Expr Bool -> Expr a

如您所见，所有数据构造函数的return类型都被泛化了。这是有问题的，因为我们知道 Number 和 Increment 必须总是 return 而 Expr Int 和 Boolean 和 Not 必须总是 return一个 Expr Bool.

数据构造函数的return类型是错误的，因为它们太笼统了。例如，Number 不可能 return 和 Expr a 但它确实如此。这允许您编写类型检查器不会捕获的错误表达式。例如：

Increment (Boolean False) -- you shouldn't be able to increment a boolean
Not       (Number  0)     -- you shouldn't be able to negate a number

问题是我们无法指定 return 类型的数据构造函数。

请注意 Expr 的所有数据构造函数都是虚构构造函数（即它们的 return 类型不依赖于它们的参数）。构造函数都是虚构造函数的数据类型称为虚类型。

请记住，像 Nil 这样的 return 类型的幻像构造函数可以专门化为我们想要的任何类型。因此，我们可以为 Expr 创建智能构造函数，如下所示：

number    :: Int       -> Expr Int
boolean   :: Bool      -> Expr Bool
increment :: Expr Int  -> Expr Int
not       :: Expr Bool -> Expr Bool

number    = Number
boolean   = Boolean
increment = Increment
not       = Not

现在我们可以使用智能构造函数代替普通构造函数，我们的问题就解决了：

increment (boolean False) -- error
not       (number  0)     -- error

因此，当您想专门化数据构造函数的 return 类型时，幻像构造函数很有用，而幻像类型是其构造函数都是幻像构造函数的数据类型。

请注意，像 Left 和 Right 这样的数据构造函数也是幻象构造函数：

data Either a b = Left a | Right b

Left  :: a -> Either a b
Right :: b -> Either a b

原因是尽管这些数据构造函数的 return 类型确实依赖于它们的参数，但它们仍然是泛化的，因为它们仅部分依赖于它们的参数。

判断数据构造函数是否为虚幻构造函数的简单方法：

Do all the type variables appearing in the return type of the data constructor also appear in the arguments of the data constructor? If yes, it's not a phantom constructor.

希望对您有所帮助。

Answer 2

回答"what's the motivation to use a phantom type"。有两点：

使无效状态无法表示，这在
携带一些类型级别的信息

例如，您可以用长度单位标记距离：

{-# LANGUAGE GeneralizedNewtypeDeriving #-}

newtype Distance a = Distance Double
  deriving (Num, Show)

data Kilometer
data Mile

marathonDistance :: Distance Kilometer
marathonDistance = Distance 42.195

distanceKmToMiles :: Distance Kilometer -> Distance Mile
distanceKmToMiles (Distance km) = Distance (0.621371 * km)

marathonDistanceInMiles :: Distance Mile
marathonDistanceInMiles = distanceKmToMiles marathonDistance

你可以避免 Mars Climate Orbiter disaster:

>>> marathonDistanceInMiles
Distance 26.218749345

>>> marathonDistanceInMiles + marathonDistance

<interactive>:10:27:
    Couldn't match type ‘Kilometer’ with ‘Mile’
    Expected type: Distance Mile
      Actual type: Distance Kilometer
    In the second argument of ‘(+)’, namely ‘marathonDistance’
    In the expression: marathonDistanceInMiles + marathonDistance

此 "pattern" 略有不同。您可以使用 DataKinds 来获得封闭的单位集：

{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE DataKinds #-}

data LengthUnit = Kilometer | Mile

newtype Distance (a :: LengthUnit) = Distance Double
  deriving (Num, Show)

marathonDistance :: Distance 'Kilometer
marathonDistance = Distance 42.195

distanceKmToMiles :: Distance 'Kilometer -> Distance 'Mile
distanceKmToMiles (Distance km) = Distance (0.621371 * km)

marathonDistanceInMiles :: Distance 'Mile
marathonDistanceInMiles = distanceKmToMiles marathonDistance

它的工作方式类似：

>>> marathonDistanceInMiles
Distance 26.218749345

>>> marathonDistance + marathonDistance
Distance 84.39

>>> marathonDistanceInMiles + marathonDistance

<interactive>:28:27:
    Couldn't match type ‘'Kilometer’ with ‘'Mile’
    Expected type: Distance 'Mile
      Actual type: Distance 'Kilometer
    In the second argument of ‘(+)’, namely ‘marathonDistance’
    In the expression: marathonDistanceInMiles + marathonDistance

但是现在 Distance 只能以公里或英里为单位，我们以后不能添加更多单位。这在某些用例中可能很有用。

我们也可以这样做：

data Distance = Distance { distanceUnit :: LengthUnit, distanceValue :: Double }
   deriving (Show)

在距离的情况下，我们可以计算出加法，例如，如果涉及到不同的单位，则转换为公里。但这对 currencies 来说效果不佳，它的比率随时间变化等

并且可以使用 GADT 代替，在某些情况下这可能是更简单的方法：

{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE StandaloneDeriving #-}

data Kilometer
data Mile

data Distance a where
  KilometerDistance :: Double -> Distance Kilometer
  MileDistance :: Double -> Distance Mile

deriving instance Show (Distance a)

marathonDistance :: Distance Kilometer
marathonDistance = KilometerDistance 42.195

distanceKmToMiles :: Distance Kilometer -> Distance Mile
distanceKmToMiles (KilometerDistance km) = MileDistance (0.621371 * km)

marathonDistanceInMiles :: Distance Mile
marathonDistanceInMiles = distanceKmToMiles marathonDistance

现在我们在数值层面上也知道了单位：

>>> marathonDistanceInMiles 
MileDistance 26.218749345

这种方法特别大大简化了 Expr a 来自的示例：

{-# LANGUAGE GADTs #-}

data Expr a where
  Number     :: Int -> Expr Int
  Boolean    :: Bool -> Expr Bool
  Increment  :: Expr Int -> Expr Int
  Not        :: Expr Bool -> Expr Bool

值得指出的是，后一种变体需要重要的语言扩展（GADTs、DataKinds、KindSignatures），您的编译器可能不支持这些扩展。 Mu 编译器 Don 提到的可能就是这种情况。

Answer 3

具体来说，Ratio D3，我们使用丰富的类型来驱动类型导向代码，例如如果您在某处有一个类型为 Ratio D3 的字段，它的编辑器将被分派到一个文本字段，该文本字段仅接受数字条目并显示 3 位精度。这与 newtype Amount = Amount Double 形成对比，我们不显示十进制数字，而是使用千位逗号并将像 '10m' 这样的输入解析为 '10,000,000'。

在底层表示中，两者仍然只是 Doubles。

幽灵类型背后的动机？

Motivation behind Phantom Types?

haskell

phantom-types