Raise an exception when parsing JSON with the wrong schema on an Optional field
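During JSON parsing, I want to catch an exception for an optional sequence field whose schema differs from my case class.
Let me elaborate.
I have the following case classes: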
case class SimpleFeature(
  column: String,
  valueType: String,
  nullValue: String,
  func: Option[String])

case class TaskConfig(
  taskInfo: TaskInfo,
  inputType: String,
  training: Table,
  testing: Table,
  eval: Table,
  splitStrategy: SplitStrategy,
  label: Label,
  simpleFeatures: Option[List[SimpleFeature]],
  model: Model,
  evaluation: Evaluation,
  output: Output)
Here is the part of the JSON file that I want to point out:
"simpleFeatures": [
  {
    "column": "pcat_id",
    "value": "categorical",
    "nullValue": "DUMMY"
  },
  {
    "column": "brand_code",
    "valueType": "categorical",
    "nullValue": "DUMMY"
  }
]
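As you can see, the first element has an error in its schema ("value" instead of "valueType"), and I want an error to be raised when it is parsed. At the same time, I want to keep the optional behaviour in case there is no object to parse.
One idea I have been looking into for a while is to create a custom serializer and check the fields manually, but I am not sure I am on the right track: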
object JSONSerializer extends CustomKeySerializer[SimpleFeatures](format => {
  case jsonObj: JObject => {
    case Some(simplFeatures (jsonObj \ "simpleFeatures")) => {
      // Extraction logic goes here
    }
  }
})
I am not very proficient in Scala and json4s, so any advice is welcome.
json4s version: 3.2.10
Scala version: 2.11.12
JDK version: 1.8.0
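You can try using play.api.libs.json:
"com.typesafe.play" %% "play-json" % "2.7.2",
"net.liftweb" % "lift-json_2.11" % "2.6.2"
You only need to define the case class and a formatter.
Example:
import net.liftweb.json.DefaultFormats
import play.api.libs.json.Json

case class Example(a: String, b: String)
// Used by lift-json only; play-json relies on the Format defined below.
implicit val formats: DefaultFormats.type = DefaultFormats
implicit val instancesFormat = Json.format[Example]
Then just do:
Json.parse(jsonData).asOpt[Example]
Note that asOpt returns None when the JSON does not match Example.
If the above gives errors, try also adding "net.liftweb" % "lift-json_2.11" % "2.6.2" to your dependencies.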
I think you need to extend the CustomSerializer class, because CustomKeySerializer is meant for implementing custom logic for JSON keys:
import org.json4s.{CustomSerializer, DefaultFormats, MappingException}
import org.json4s.JsonAST._
import org.json4s.JsonDSL._
import org.json4s.jackson.JsonMethods._

case class SimpleFeature(column: String,
                         valueType: String,
                         nullValue: String,
                         func: Option[String])

case class TaskConfig(simpleFeatures: Option[Seq[SimpleFeature]])

object Main extends App {
  // Register the custom serializer and enable strict parsing of Option fields.
  implicit val formats = new DefaultFormats {
    override val strictOptionParsing: Boolean = true
  } + new SimpleFeatureSerializer()

  class SimpleFeatureSerializer extends CustomSerializer[SimpleFeature](_ => ( {
    // Deserializer: JObject => SimpleFeature
    case jsonObj: JObject =>
      val requiredKeys = Set[String]("column", "valueType", "nullValue")
      // Required keys that are absent from the incoming object.
      val diff = requiredKeys.diff(jsonObj.values.keySet)
      if (diff.nonEmpty)
        throw new MappingException(s"Fields [${requiredKeys.mkString(",")}] are mandatory. Missing fields: [${diff.mkString(",")}]")

      val column = (jsonObj \ "column").extract[String]
      val valueType = (jsonObj \ "valueType").extract[String]
      val nullValue = (jsonObj \ "nullValue").extract[String]
      val func = (jsonObj \ "func").extract[Option[String]]

      SimpleFeature(column, valueType, nullValue, func)
  }, {
    // Serializer: SimpleFeature => JObject
    case sf: SimpleFeature =>
      ("column" -> sf.column) ~
        ("valueType" -> sf.valueType) ~
        ("nullValue" -> sf.nullValue) ~
        ("func" -> sf.func)
  }
  ))

  // case 1: Test single feature
  val singleFeature = """
  {
    "column": "pcat_id",
    "valueType": "categorical",
    "nullValue": "DUMMY"
  }
  """
  val singleFeatureValid = parse(singleFeature).extract[SimpleFeature]
  println(singleFeatureValid)
  // SimpleFeature(pcat_id,categorical,DUMMY,None)

  // case 2: Test task config
  val taskConfig = """{
    "simpleFeatures": [
      {
        "column": "pcat_id",
        "valueType": "categorical",
        "nullValue": "DUMMY"
      },
      {
        "column": "brand_code",
        "valueType": "categorical",
        "nullValue": "DUMMY"
      }]
  }"""
  val taskConfigValid = parse(taskConfig).extract[TaskConfig]
  println(taskConfigValid)
  // TaskConfig(Some(List(SimpleFeature(pcat_id,categorical,DUMMY,None), SimpleFeature(brand_code,categorical,DUMMY,None))))

  // case 3: Invalid json ("value" instead of "valueType")
  val invalidSingleFeature = """
  {
    "column": "pcat_id",
    "value": "categorical",
    "nullValue": "DUMMY"
  }
  """
  val singleFeatureInvalid = parse(invalidSingleFeature).extract[SimpleFeature]
  // throws MappingException
}
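Analysis: the main question here is how to get access to the keys of jsonObj in order to check whether a key is invalid or missing. One way to achieve that is through jsonObj.values.keySet. For the implementation, we first assign the mandatory fields to the requiredKeys variable, then compare requiredKeys with the keys actually present using requiredKeys.diff(jsonObj.values.keySet). If the difference is not empty, a mandatory field is missing, and in that case we throw an exception that carries the necessary information.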
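The key check described in the analysis can be tried in isolation. The following is a minimal sketch in plain Scala, assuming the JSON object has already been turned into a Map (which is what jsonObj.values provides); the names KeyCheck and missingKeys are made up for illustration and are not part of json4s.

```scala
// Plain-Scala sketch of the mandatory-key check; no json4s required.
// KeyCheck and missingKeys are hypothetical names used only for this example.
object KeyCheck {
  // Mandatory keys, the same ones the serializer in the answer requires.
  val requiredKeys: Set[String] = Set("column", "valueType", "nullValue")

  // Returns the required keys that are absent from the given object.
  def missingKeys(obj: Map[String, Any]): Set[String] =
    requiredKeys.diff(obj.keySet)

  def main(args: Array[String]): Unit = {
    val valid = Map[String, Any](
      "column" -> "brand_code", "valueType" -> "categorical", "nullValue" -> "DUMMY")
    val invalid = Map[String, Any](
      "column" -> "pcat_id", "value" -> "categorical", "nullValue" -> "DUMMY")

    println(missingKeys(valid))   // Set()
    println(missingKeys(invalid)) // Set(valueType)
  }
}
```

Note that a diff in this direction only reports missing keys. An unexpected key such as "value" is not reported by itself; detecting it would need the reverse difference (present keys minus known keys).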
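Note 1: we should not forget to add the new serializer to the available formats.
Note 2: we throw an instance of MappingException, which json4s already uses internally when parsing a JSON string.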
Update
To force validation of Option fields, you need to set the strictOptionParsing option to true by overriding the corresponding member:
implicit val formats = new DefaultFormats {
  override val strictOptionParsing: Boolean = true
} + new SimpleFeatureSerializer()
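To make the intended behaviour concrete (an absent optional field stays None, while a present but malformed element raises an error), here is a plain-Scala model of it. This is only a sketch of the semantics, not json4s itself: OptionalFieldModel, toFeature and toFeatures are hypothetical names, the input is assumed to be already parsed into Maps, and IllegalArgumentException stands in for json4s's MappingException.

```scala
import scala.util.Try

// Plain-Scala model of "optional, but strict when present"; not json4s code.
object OptionalFieldModel {
  final case class SimpleFeature(column: String,
                                 valueType: String,
                                 nullValue: String,
                                 func: Option[String])

  private val requiredKeys = Set("column", "valueType", "nullValue")

  // Builds one feature, or throws if a mandatory key is missing.
  def toFeature(obj: Map[String, String]): SimpleFeature = {
    val missing = requiredKeys.diff(obj.keySet)
    if (missing.nonEmpty)
      throw new IllegalArgumentException(s"Missing fields: [${missing.mkString(",")}]")
    SimpleFeature(obj("column"), obj("valueType"), obj("nullValue"), obj.get("func"))
  }

  // Absent field -> None; present field -> every element must be valid.
  def toFeatures(field: Option[List[Map[String, String]]]): Option[List[SimpleFeature]] =
    field.map(_.map(toFeature))

  def main(args: Array[String]): Unit = {
    val bad = Map("column" -> "pcat_id", "value" -> "categorical", "nullValue" -> "DUMMY")

    println(toFeatures(None))                           // None
    println(Try(toFeatures(Some(List(bad)))).isFailure) // true
  }
}
```

The point of the model is that optionality is decided by whether the field is present, and validity is checked per element once it is; a failure is never silently mapped to None.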
Resources
https://nmatpt.com/blog/2017/01/29/json4s-custom-serializer/
https://danielasfregola.com/2015/08/17/spray-how-to-deserialize-entities-with-json4s/
https://www.programcreek.com/scala/org.json4s.CustomSerializer