我应该对从数据库中检索到的数据的有效性给予多大的信任？

Question

问我问题的其他方式是："Should I keep the data types coming from database as simple and raw as I would ask them from my REST endpoint"

想象一下这种情况 class 我想将其作为一行存储在数据库中：

case class Product(id: UUID,name: String, price: BigInt)

它显然不是也不应该像它所说的那样，因为 name 和 price 的类型签名是一个谎言。

所以我们所做的是创建自定义数据类型，以更好地表示事物，例如：（为了简单起见，假设我们唯一关心的是 price 数据类型）

case class Price(value: BigInt) {
  require(value > BigInt(0))
}

object Price {
  def validate(amount: BigInt): Either[String,Price] = 
    Try(Price(amount)).toOption.toRight("invalid.price")
}

//As a result my Product class is now:
case class Product(id: UUID,name: String,price: Price)

所以现在为产品数据获取用户输入的过程如下所示：

//this class would be parsed from i.e a form:
case class ProductInputData(name: String, price: BigInt)

def create(input: ProductInputData) = {
  for {
    validPrice <- Price.validate(input.price)
  } yield productsRepo.insert(
      Product(id = UUID.randomUUID,name = input.name,price = ???)
    )
}

看看三重问号 (???)。从整个应用程序架构的角度来看，这是我的主要关注点；如果我能够在数据库中将列存储为 Price（例如 slick 支持这些自定义数据类型），那么这意味着我可以选择将价格存储为 price : BigInt = validPrice.value或 price: Price = validPrice.

我在这两个决定中看到了太多的利弊，但我无法做出决定。以下是我认为支持每个选择的论点：

将数据存储为简单的数据库类型（即BigInt）因为:

性能：x > 0 对 Price 的创建的简单断言是微不足道的，但假设您想验证自定义 Email 使用复杂的正则表达式键入。这对检索集合是有害的
对腐败的容忍度：如果将 BigInt 作为负值插入，则每次您的应用程序尝试简单地阅读专栏并将其扔到用户界面上。但是，如果它被检索然后涉及某些域层处理（例如购买），则会导致问题。

将数据存储为域丰富类型（即 Price），因为:

没有隐含的推理和信任：系统中其他地方的其他方法需要价格有效。例如：

//two terrible variations of a calculateDiscount method:

//this version simply trusts that price is already valid and came from db:
def calculateDiscount(price: BigInt): BigInt = {
  //apply some positive coefficient to price and hopefully get a positive 
  //number from it and if it's not positive because price is not positive then 
  //it'll explode in your face.
}

//this version is even worse. It does retain function totality and purity
//but the unforgivable culture it encourages is the kind of defensive and 
//pranoid programming that causes every developer to write some guard 
//expressions performing duplicated validation All over!
def calculateDiscount(price: BigInt): Option[BigInt] = {
  if (price <= BigInt(0)) 
    None
  else 
  Some{
   //Do safe processing
  }
} 

//ideally you want it to look like this:
def calculateDiscount(price: Price): Price

没有域类型到简单类型的常量转换，反之亦然：用于表示、存储、域层等；您只需在系统中拥有一个代表来统治他们。

我看到的所有这些混乱的根源是数据库。如果数据来自用户，那就很容易了：你永远不会相信它是有效的。您要求简单的数据类型将它们转换为经过验证的域类型，然后继续。但不是数据库。现代分层架构是否以某种明确或至少缓解的方式解决了这个问题？

Answer 1

保护数据库的完整性。就像保护对象内部状态的完整性一样。
信任数据库。检查并重新检查已经检查过的内容是没有意义的。
尽可能多地使用域对象。等到最后一刻放弃它们（原始 JDBC 代码或在数据呈现之前）。
不容忍损坏的数据。如果数据损坏，应用程序应该崩溃。否则它可能会产生更多损坏的数据。

从数据库检索时 require 调用的开销可以忽略不计。如果您真的认为这是一个问题，请提供 2 个构造函数，一个用于来自用户的数据（执行验证），另一个假设数据是好的（用于数据库代码）。

我喜欢指向错误的异常（由于验证不足导致数据损坏）。

也就是说，我经常在代码中留下 requires 以帮助捕获更复杂验证中的错误（可能来自多个表的数据以某种无效方式组合）。系统仍然崩溃（它应该崩溃），但我收到了更好的错误消息。

我应该对从数据库中检索到的数据的有效性给予多大的信任？

How much trust should I put in the validity of retrieved data from database?

database

architecture

trust

domain-driven-design

scala