Scala with Cats data validation exercise on Kleisli: why does my code fail fast instead of accumulating errors?

I'm reading the book Scala with Cats and working through its exercises. I ran into a problem when I reached the case study on data validation.

Here is my entire code (the same as in the book):

package org.scala.ch10.final_recap

import cats.Semigroup
import cats.data.Validated
import cats.data.Validated._
import cats.data.Kleisli
import cats.data.NonEmptyList
import cats.instances.either._
import cats.syntax.apply._
import cats.syntax.semigroup._
import cats.syntax.validated._

sealed trait Predicate[E, A] {
  import Predicate._
  def and(that: Predicate[E, A]): Predicate[E, A] =
    And(this, that)

  def or(that: Predicate[E, A]): Predicate[E, A] =
    Or(this, that)

  /**
   * This part is for Kleislis
   * @return
   */
  def run(implicit s: Semigroup[E]): A => Either[E, A] =
    (a: A) => this(a).toEither

  def apply(a: A)(implicit s: Semigroup[E]): Validated[E, A] =
    this match {
      case Pure(func) =>
        func(a)
      case And(left, right) => (left(a), right(a)).mapN((_, _) => a)
      case Or(left, right) =>
        left(a) match {
          case Valid(_) => Valid(a)
          case Invalid(e1) =>
            right(a) match {
              // Or succeeds as soon as either side is valid
              case Valid(_) => Valid(a)
              case Invalid(e2) => Invalid(e1 |+| e2)
            }
        }
    }
}

object Predicate {
  final case class And[E, A](left: Predicate[E, A], right: Predicate[E, A]) extends Predicate[E, A]
  final case class Or[E, A](left: Predicate[E, A], right: Predicate[E, A]) extends Predicate[E, A]
  final case class Pure[E, A](func: A => Validated[E, A]) extends Predicate[E, A]
  def apply[E, A](f: A => Validated[E, A]): Predicate[E, A] = Pure(f)
  def lift[E, A](err: E, fn: A => Boolean): Predicate[E, A] = Pure(a => if(fn(a)) a.valid else err.invalid)
}

object FinalRecapPredicate {
  type Errors = NonEmptyList[String]
  def error(s: String): NonEmptyList[String] = NonEmptyList(s, Nil)
  type Result[A] = Either[Errors, A]
  type Check[A, B] = Kleisli[Result, A, B]
  def check[A, B](func: A => Result[B]): Check[A, B] = Kleisli(func)
  def checkPred[A](pred: Predicate[Errors, A]): Check[A, A] =
    Kleisli[Result, A, A](pred.run)

  def longerThan(n: Int): Predicate[Errors, String] =
    Predicate.lift(
      error(s"Must be longer than $n characters"),
      str => str.length > n
    )

  val alphanumeric: Predicate[Errors, String] =
    Predicate.lift(
      error(s"Must be all alphanumeric characters"),
      str => str.forall(_.isLetterOrDigit)
    )

  def contains(char: Char): Predicate[Errors, String] =
    Predicate.lift(
      error(s"Must contain the character $char"),
      str => str.contains(char)
    )

  def containsOnce(char: Char): Predicate[Errors, String] =
    Predicate.lift(
      error(s"Must contain the character $char only once"),
      str => str.count(_ == char) == 1
    )

  val checkUsername: Check[String, String] = checkPred(longerThan(3) and alphanumeric)

  val splitEmail: Check[String, (String, String)] = check(_.split('@') match {
    case Array(name, domain) =>
      Right((name, domain))
    case _ =>
      Left(error("Must contain a single @ character"))
  })

  val checkLeft: Check[String, String] = checkPred(longerThan(0))

  val checkRight: Check[String, String] = checkPred(longerThan(3) and contains('.'))

  val joinEmail: Check[(String, String), String] =
    check {
      case (l, r) => (checkLeft(l), checkRight(r)).mapN(_ + "@" + _)
    }

  val checkEmail: Check[String, String] = splitEmail andThen joinEmail

  final case class User(username: String, email: String)

  def createUser(username: String, email: String): Either[Errors, User] =
    (checkUsername.run(username),
      checkEmail.run(email)).mapN(User)

  def main(args: Array[String]): Unit = {
    println(createUser("", "noel@underscore.io@io"))
  }
}

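The code is supposed to end with the error message Left(NonEmptyList(Must be longer than 3 characters, Must contain a single @ character)), but what I actually get is Left(NonEmptyList(Must be longer than 3 characters)).

Obviously, it does not work as expected. It fails fast instead of accumulating errors. How can I fix this? (I have already spent hours on it and still can't find a workaround.)

This is the "problematic" part: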

def createUser(username: String, email: String): Either[Errors, User] =
  (checkUsername.run(username),
    checkEmail.run(email)).mapN(User)

You are combining a tuple of Results, where

type Result[A] = Either[Errors, A]

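This means you are actually calling mapN on a pair of Eithers, an operation provided through the Semigroupal type class. For Either, this operation does not accumulate results: it stops at the first Left.

There are several reasons for this, but one that I find particularly important is preserving behaviour when the Semigroupal / Applicative we are using also happens to be a Monad. Why is that a concern? Because a Monad sequences operations, making each step depend on the previous one, which gives it "fail early" semantics. When working with a Monad, one can reasonably expect those semantics to be preserved when using constructs from the underlying Applicative (every Monad is also an Applicative). If the Applicative implementation used "accumulating" semantics instead of "fail early" semantics, we would break some important laws: the applicative operations would no longer agree with the ones derived from flatMap, so replacing one with the other could change a program's result.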
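
To make the consistency argument concrete, here is a minimal sketch (the object and value names are made up for illustration) that compares mapN on two Eithers with the equivalent for-comprehension. It assumes a recent Cats 2.x on Scala 2.13 or 3; both forms stop at the first Left and yield the same value.

```scala
import cats.implicits._

object MapNVersusFlatMap {
  def main(args: Array[String]): Unit = {
    val a: Either[String, Int] = Left("foo")
    val b: Either[String, Int] = Left("bar")

    // Applicative-style combination through mapN
    val viaMapN: Either[String, Int] = (a, b).mapN(_ + _)

    // Monadic sequencing through flatMap / map
    val viaFlatMap: Either[String, Int] =
      for {
        x <- a
        y <- b
      } yield x + y

    // Both short-circuit on the first Left, so they agree
    assert(viaMapN == Left("foo"))
    assert(viaMapN == viaFlatMap)
    println(viaMapN)
  }
}
```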

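You can use the parallel version of mapN, called parMapN, whose contract guarantees that the implementation evaluates all results "in parallel", that is, independently of one another. This means it definitely cannot be expected to have "fail early" semantics, so accumulating results is fine in this case. For Either, parMapN accumulates through Validated and therefore requires a Semigroup for the error type, which NonEmptyList provides.

Note that Validated accumulates results as well, usually in a NonEmptyList or a NonEmptyChain. This is probably why you expected to see accumulated results; the only problem is that you were not using Validated values in the "problematic" part of your code, but raw Eithers instead.

Here is some simple code that demonstrates the concepts above: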

import cats.data._
import cats.implicits._

val l1: Either[String, Int] = Left("foo")
val l2: Either[String, Int] = Left("bar")

(l1, l2).mapN(_ + _) 
// Left(foo)

(l1, l2).parMapN(_ + _) 
// Left(foobar)

val v1: ValidatedNel[String, Int] = l1.toValidatedNel
val v2: ValidatedNel[String, Int] = l2.toValidatedNel

(v1, v2).mapN(_ + _) 
// Invalid(NonEmptyList(foo, bar))
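
Applied to createUser, the two options could look roughly like the sketch below. To keep it self-contained, checkUsername and checkEmail here are simplified stand-ins for checkUsername.run and checkEmail.run from the question (they are assumptions, not the book's Kleisli-based checks), and the names createUserPar and createUserValidated are made up for illustration. It assumes a recent Cats 2.x on Scala 2.13 or 3.

```scala
import cats.data.NonEmptyList
import cats.implicits._

object AccumulatingCreateUser {
  type Errors = NonEmptyList[String]

  final case class User(username: String, email: String)

  // Simplified stand-in for checkUsername.run (assumption)
  def checkUsername(s: String): Either[Errors, String] =
    if (s.length > 3) Right(s)
    else Left(NonEmptyList.one("Must be longer than 3 characters"))

  // Simplified stand-in for checkEmail.run (assumption)
  def checkEmail(s: String): Either[Errors, String] =
    if (s.count(_ == '@') == 1) Right(s)
    else Left(NonEmptyList.one("Must contain a single @ character"))

  // Option 1: parMapN accumulates the Lefts using Semigroup[Errors]
  def createUserPar(username: String, email: String): Either[Errors, User] =
    (checkUsername(username), checkEmail(email)).parMapN(User.apply)

  // Option 2: convert to Validated, combine with mapN, convert back
  def createUserValidated(username: String, email: String): Either[Errors, User] =
    (checkUsername(username).toValidated, checkEmail(email).toValidated)
      .mapN(User.apply)
      .toEither

  def main(args: Array[String]): Unit = {
    println(createUserPar("", "noel@underscore.io@io"))
    println(createUserValidated("", "noel@underscore.io@io"))
  }
}
```

In the original code, the same change would amount to replacing mapN with parMapN in createUser; the checks themselves can stay as they are.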