Raise an exception when parsing JSON with the wrong schema on an Optional field

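While parsing JSON, I want an exception to be raised for an optional sequence field whose schema differs from my case class. Let me elaborate.

I have the following case classes: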

case class SimpleFeature(
  column: String,
  valueType: String,
  nullValue: String,
  func: Option[String])

case class TaskConfig(
  taskInfo: TaskInfo,
  inputType: String,
  training: Table,
  testing: Table,
  eval: Table,
  splitStrategy: SplitStrategy,
  label: Label,
  simpleFeatures: Option[List[SimpleFeature]],
  model: Model,
  evaluation: Evaluation,
  output: Output)

This is the part of the JSON file I want to point out:

"simpleFeatures": [
  {
    "column": "pcat_id",
    "value": "categorical",
    "nullValue": "DUMMY"
  },
  {
    "column": "brand_code",
    "valueType": "categorical",
    "nullValue": "DUMMY"
  }
]

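As you can see, the first element does not match the schema ("value" instead of "valueType"), and I want an error to be raised while parsing. At the same time, I want to keep the optional behavior in case there is no object to parse.

One idea I have been looking into for a while is to create a custom serializer and check the fields manually, but I am not sure I am on the right track: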

object JSONSerializer extends CustomKeySerializer[SimpleFeatures](format => {
  case jsonObj: JObject => {
    case Some(simplFeatures (jsonObj \ "simpleFeatures")) => {
    // Extraction logic goes here
    }
  }
})

I am not very proficient in Scala and json4s, so any advice is welcome.

json4s version
3.2.10

scala version
2.11.12

jdk version
1.8.0

You can try using play.api.libs.json:

"com.typesafe.play" %% "play-json" % "2.7.2",
"net.liftweb" % "lift-json_2.11" % "2.6.2" 

You only need to define the case class and a formatter for it.

Example:

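import net.liftweb.json.DefaultFormats
import play.api.libs.json.{Json, OFormat}

case class Example(a: String, b: String)

// lift-json formats; play-json itself only needs the implicit OFormat below
implicit val formats: DefaultFormats.type = DefaultFormats
implicit val instancesFormat: OFormat[Example] = Json.format[Example]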

Then simply do:

Json.parse(jsonData).asOpt[Example]

If the above gives errors, try also adding "net.liftweb" % "lift-json_2.11" % "2.6.2" to your dependencies.
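Note that asOpt returns None on any mismatch instead of raising. The following is a minimal sketch, assuming play-json 2.7.x and macro-generated formats, of how validate could be used to tell "field absent" apart from "field present but malformed". The object name PlayJsonSketch and the trimmed-down TaskConfig are only for illustration and are not part of the original code.

```scala
import play.api.libs.json.{JsResult, Json, OFormat}

// Trimmed-down versions of the case classes from the question (illustrative).
case class SimpleFeature(column: String, valueType: String, nullValue: String, func: Option[String])
case class TaskConfig(simpleFeatures: Option[List[SimpleFeature]])

object PlayJsonSketch {
  // Macro-generated formats; Option fields are read as nullable.
  implicit val simpleFeatureFormat: OFormat[SimpleFeature] = Json.format[SimpleFeature]
  implicit val taskConfigFormat: OFormat[TaskConfig] = Json.format[TaskConfig]

  // Field absent: the Option is simply empty.
  val absent: JsResult[TaskConfig] = Json.parse("{}").validate[TaskConfig]

  // Field present, but the first element says "value" instead of "valueType".
  val malformed: JsResult[TaskConfig] = Json.parse(
    """{"simpleFeatures": [{"column": "pcat_id", "value": "categorical", "nullValue": "DUMMY"}]}"""
  ).validate[TaskConfig]

  def main(args: Array[String]): Unit = {
    println(absent)    // a JsSuccess wrapping TaskConfig(None)
    println(malformed) // a JsError pointing at the missing valueType
  }
}
```

If an exception is preferred over a JsResult, something like malformed.getOrElse(throw new IllegalArgumentException("invalid task config")) would turn the JsError into one.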

I think you need to extend the CustomSerializer class, because CustomKeySerializer is meant for implementing custom logic for JSON keys:

import org.json4s.{CustomSerializer, DefaultFormats, MappingException}
import org.json4s.JsonAST._
import org.json4s.JsonDSL._
import org.json4s.jackson.JsonMethods._

case class SimpleFeature(column: String,
                          valueType: String,
                          nullValue: String,
                          func: Option[String])

case class TaskConfig(simpleFeatures: Option[Seq[SimpleFeature]])

object Main extends App {

  implicit val formats = new DefaultFormats {
    override val strictOptionParsing: Boolean = true
  } + new SimpleFeatureSerializer()

  class SimpleFeatureSerializer extends CustomSerializer[SimpleFeature](_ => ( {
    case jsonObj: JObject =>
      val requiredKeys = Set[String]("column", "valueType", "nullValue")

      val diff = requiredKeys.diff(jsonObj.values.keySet)
      if (diff.nonEmpty)
        throw new MappingException(s"Fields [${requiredKeys.mkString(",")}] are mandatory. Missing fields: [${diff.mkString(",")}]")

      val column = (jsonObj \ "column").extract[String]
      val valueType = (jsonObj \ "valueType").extract[String]
      val nullValue = (jsonObj \ "nullValue").extract[String]
      val func = (jsonObj \ "func").extract[Option[String]]

      SimpleFeature(column, valueType, nullValue, func)
  }, {
    case sf: SimpleFeature =>
      ("column" -> sf.column) ~
        ("valueType" -> sf.valueType) ~
        ("nullValue" -> sf.nullValue) ~
        ("func" -> sf.func)
  }
  ))

  // case 1: Test single feature
  val singleFeature  = """
          {
              "column": "pcat_id",
              "valueType": "categorical",
              "nullValue": "DUMMY"
          }
      """
  val singleFeatureValid = parse(singleFeature).extract[SimpleFeature]
  println(singleFeatureValid)
  //  SimpleFeature(pcat_id,categorical,DUMMY,None)

  // case 2: Test task config
  val taskConfig  = """{
      "simpleFeatures": [
        {
          "column": "pcat_id",
          "valueType": "categorical",
          "nullValue": "DUMMY"
        },
        {
          "column": "brand_code",
          "valueType": "categorical",
          "nullValue": "DUMMY"
        }]
  }"""

  val taskConfigValid = parse(taskConfig).extract[TaskConfig]
  println(taskConfigValid)
  //  TaskConfig(Some(List(SimpleFeature(pcat_id,categorical,DUMMY,None), SimpleFeature(brand_code,categorical,DUMMY,None))))

  // case 3: Invalid json
  val invalidSingleFeature  = """
          {
              "column": "pcat_id",
              "value": "categorical",
              "nullValue": "DUMMY"
          }
      """
  val singleFeatureInvalid = parse(invalidSingleFeature).extract[SimpleFeature]
  // throws MappingException
}

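Analysis: the main question here is how to get access to the keys of jsonObj in order to check whether any key is invalid or missing. One way to do that is jsonObj.values.keySet. For the implementation, we first assign the mandatory fields to the requiredKeys variable, then compare them with the keys actually present via requiredKeys.diff(jsonObj.values.keySet). If the difference is not empty, a mandatory field is missing, in which case we throw an exception containing the necessary information.

Note 1: we should not forget to add the new serializer to the available formats.

Note 2: we throw an instance of MappingException, which json4s itself already uses internally when parsing the JSON string.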
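The key check described in the analysis can also be tried in isolation. This is only a sketch: the object KeyCheckSketch and the helpers missingKeys and requireKeys are hypothetical names introduced here, not part of json4s.

```scala
import org.json4s._
import org.json4s.jackson.JsonMethods.parse
import scala.util.Try

object KeyCheckSketch {
  val requiredKeys: Set[String] = Set("column", "valueType", "nullValue")

  // Returns the mandatory keys that are absent from the given JSON object.
  def missingKeys(jsonObj: JObject): Set[String] =
    requiredKeys.diff(jsonObj.values.keySet)

  // Throws a MappingException, the exception type json4s uses for its own mapping failures.
  def requireKeys(jsonObj: JObject): JObject = {
    val diff = missingKeys(jsonObj)
    if (diff.nonEmpty)
      throw new MappingException(s"Missing fields: [${diff.mkString(",")}]")
    jsonObj
  }

  // The malformed element from the question; the literal is a JSON object, so the cast is safe here.
  val invalid: JObject = parse(
    """{"column": "pcat_id", "value": "categorical", "nullValue": "DUMMY"}"""
  ).asInstanceOf[JObject]

  def main(args: Array[String]): Unit = {
    println(missingKeys(invalid))                // Set(valueType)
    println(Try(requireKeys(invalid)).isFailure) // true
  }
}
```

Keeping the check in a small function like this makes it easy to reuse from the deserializer half of a CustomSerializer and to unit-test without going through extract.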

Update

To force validation of Option fields, you need to set the strictOptionParsing option to true by overriding the corresponding member:

implicit val formats = new DefaultFormats {
    override val strictOptionParsing: Boolean = true
  } + new SimpleFeatureSerializer()
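To see what this setting changes, the two configurations can be compared side by side. This is a rough sketch, assuming the json4s version on the classpath exposes strictOptionParsing and using the default extractor without the custom serializer. StrictOptionSketch and extractWith are illustrative names. With the default formats, a failure inside an Option field should be swallowed and turned into None, while the strict formats should let the MappingException propagate.

```scala
import org.json4s._
import org.json4s.jackson.JsonMethods.parse
import scala.util.Try

// Trimmed-down versions of the case classes from the question (illustrative).
case class SimpleFeature(column: String, valueType: String, nullValue: String, func: Option[String])
case class TaskConfig(simpleFeatures: Option[List[SimpleFeature]])

object StrictOptionSketch {
  val lenient: Formats = DefaultFormats
  val strict: Formats = new DefaultFormats {
    override val strictOptionParsing: Boolean = true
  }

  // The first element uses "value" instead of "valueType".
  val json: JValue = parse(
    """{"simpleFeatures": [{"column": "pcat_id", "value": "categorical", "nullValue": "DUMMY"}]}"""
  )

  // Runs the extraction with the given formats and captures any exception.
  def extractWith(formats: Formats): Try[TaskConfig] =
    Try(json.extract[TaskConfig](formats, manifest[TaskConfig]))

  def main(args: Array[String]): Unit = {
    println(s"lenient: ${extractWith(lenient)}")
    println(s"strict:  ${extractWith(strict)}")
  }
}
```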

Resources

https://nmatpt.com/blog/2017/01/29/json4s-custom-serializer/

https://danielasfregola.com/2015/08/17/spray-how-to-deserialize-entities-with-json4s/

https://www.programcreek.com/scala/org.json4s.CustomSerializer