How to write a validator method to validate JSON element data with Python

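I'm new to Python and am trying to write a Python script that uses JSON Schema to validate the schema of a huge JSON output file. I want to make sure the JSON file contains no null values.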

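I wrote a method that reads the schema JSON file and the output JSON file, and I now pass both of them to the validate function. The JSON file contains many repeated objects, so I realized I should write a validator function/class, pass each object to it, and keep validating them in a loop. But I'm stuck here and don't know how to do that.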

{
  "id": "test",
  "name": "name",
  "cake_name": "test",
  "metric": 0.5,
  "anticipations": [
    {
      "time": "2018-01-01 00:00:00",
      "points": 0.49128797804879504,
      "top_properties": {
        "LA:TB2341": 0.23,
        "LA:TB2342": 0.23,
        "LA:TB2343": 0.23
      },
      "status": 0,
      "alert": false
    },
    {
      "time": "2018-01-02 00:00:00",
      "points": 0.588751186433263,
      "top_properties": {
        "LA:TB2342": 0.23,
        "LA:TB2341": 0.23,
        "LA:TB2344": 0.23
      },
      "status": 0,
      "alert": true
    }
  ]
}

PS: the corresponding schema file was generated from "https://jsonschema.net/"; that is my moduleschema.json, and the JSON above is modelout.json.

The code I have written so far only reads the files:

def test_json(self):
    with open('/Users/moduleschema.json', 'r') as json_file:

        schema = json_file.read()
        print(schema)

    with open('/Users/modelout.json', 'r') as output_json:
        outputfile = output_json.read()

        print(outputfile)

    strt = jsonschema.validate(outputfile, schema)
    jsonschema.Draft4Validator
    print(strt)

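I want to parse the JSON file to make sure every field has the correct type (integers for integer fields, strings for string fields). I'm new to Python, so forgive me if this is a silly question. Thanks!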

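So I'm going to give an answer that relies on a third-party package I really like. I'm not involved in it, but I have used it, and it is very useful, especially for the kind of validation needed here.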

Yes, you can create a custom validator, for example:

import json
import typing

# here json_data is the data in your question
def custom_validator(json_data: typing.Dict):
    # "status" holds an integer in the data, so it does not belong in this list
    string_attributes = ["id", "name", "cake_name", "time", "LA:TB2342", "LA:TB2341", "LA:TB2344"]
    int_attributes = [...]
    float_attributes = [...]
    validations_errors = []
    for attribute in string_attributes:
        if attribute in json_data:
            if attribute in string_attributes and not isinstance(json_data.get(attribute), str):
                validations_errors.append(f"key {attribute} is not a string, got {json_data.get(attribute)}")
                ...

This gets out of hand quickly. Maybe you could spend more time making it neater, and so on.
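
One way to keep a hand-rolled check like this from sprawling is to drive it from a single mapping of key to expected type instead of one list per type. A minimal sketch using only the standard library; the names `EXPECTED_TYPES` and `check_types` are made up for illustration, and it only looks at top-level keys:

```python
import typing

# Hypothetical mapping for the top-level keys in the question's JSON.
EXPECTED_TYPES: typing.Dict[str, type] = {
    "id": str,
    "name": str,
    "cake_name": str,
    "metric": float,
    "anticipations": list,
}


def check_types(data: typing.Dict[str, typing.Any],
                expected: typing.Dict[str, type]) -> typing.List[str]:
    """Return one message per key that is missing or has the wrong type."""
    errors = []
    for key, expected_type in expected.items():
        if key not in data:
            errors.append(f"key {key} is missing")
            continue
        value = data[key]
        # bool is a subclass of int in Python, so reject it explicitly
        # unless a bool is what was asked for.
        is_stray_bool = isinstance(value, bool) and expected_type is not bool
        if is_stray_bool or not isinstance(value, expected_type):
            errors.append(
                f"key {key} is not a {expected_type.__name__}, got {value!r}"
            )
    return errors


print(check_types({"id": None, "name": "name", "cake_name": "test",
                   "metric": 0.5, "anticipations": []}, EXPECTED_TYPES))
# prints ['key id is not a str, got None']
```

Note that a JSON number written without a decimal point (for example `"metric": 1`) is loaded as an `int`, so a strict `float` check like this one would flag it; whether that is desirable depends on the data.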

However, I strongly recommend reading up on dataclasses and pydantic.
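
On the dataclasses half of that suggestion: a dataclass does not check types by itself, but a `__post_init__` hook can. Below is a rough standard-library sketch for one entry of `anticipations`. The `__post_init__` check is an illustrative assumption, not something dataclasses do for you, and it only handles plain types such as `str` or `int` (which is why `top_properties` is annotated as a bare `dict` here):

```python
from dataclasses import dataclass, fields


@dataclass
class Anticipation:
    time: str
    points: float
    top_properties: dict
    status: int
    alert: bool

    def __post_init__(self):
        # Compare each value against its annotated type; this works here
        # only because every annotation is a plain class.
        for field in fields(self):
            value = getattr(self, field.name)
            if not isinstance(value, field.type):
                raise TypeError(
                    f"{field.name}: expected {field.type.__name__}, got {value!r}"
                )


entry = {
    "time": "2018-01-02 00:00:00",
    "points": 0.588751186433263,
    "top_properties": {"LA:TB2342": 0.23},
    "status": None,  # null in the JSON
    "alert": True,
}

try:
    Anticipation(**entry)
except TypeError as exc:
    print(exc)  # prints: status: expected int, got None
```

This stops at the first bad field and does not look inside `top_properties`, which is part of why a dedicated validation library is more comfortable for nested data.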

Here is the solution I would use, with pydantic:

import json
import typing
from pydantic import BaseModel

# if you look closely, this just represents those tiny dictionaries in your list
class Anticipation(BaseModel):
    time: str
    points: float
    top_properties: typing.Dict[str, float]
    status: int
    alert: bool

# this is the whole thing, note how we say that anticipations is a list of those objects we defined above
class Data(BaseModel):
    id: str
    name: str
    cake_name: str
    metric: float
    anticipations: typing.List[Anticipation]

json_data = """{
  "id": null,
  "name": "name",
  "cake_name": "test",
  "metric": 0.5,
  "anticipations": [
    {
      "time": "2018-01-01 00:00:00",
      "points": 0.49128797804879504,
      "top_properties": {
        "LA:TB2341": 0.23,
        "LA:TB2342": 0.23,
        "LA:TB2343": 0.23
      },
      "status": 0,
      "alert": false
    },
    {
      "time": "2018-01-02 00:00:00",
      "points": 0.588751186433263,
      "top_properties": {
        "LA:TB2342": 0.23,
        "LA:TB2341": 0.23,
        "LA:TB2344": 0.23
      },
      "status": null,
      "alert": true
    }
  ]
}
"""
data = json.loads(json_data)
data = Data(**data)

I changed id to null and status to null. If you run this, it fails with the message below, which is quite helpful:

pydantic.error_wrappers.ValidationError: 2 validation errors
id
  none is not an allowed value (type=type_error.none.not_allowed)
anticipations -> 1 -> status
  value is not a valid integer (type=type_error.integer)

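Obviously this means you have to install a third-party package, and people would advise a new Python coder against that. In that case, the template below should point you in the right direction: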

def validate(my_dict: typing.Dict, string_attributes, int_attributes, float_attributes):
    validations_errors = []
    for attribute in string_attributes:
        if attribute in my_dict:
            if attribute in string_attributes and not isinstance(my_dict.get(attribute), str):
                validations_errors.append(f"key {attribute} is not a string, got {my_dict.get(attribute)}")
            if attribute in int_attributes and not isinstance(my_dict.get(attribute), int):
                # append to the list of errors
                pass
    return validations_errors


def custom_validator(json_data: typing.Dict):
    string_attributes = ["id", "name", "cake_name", "time", "LA:TB2342", "LA:TB2341", "LA:TB2344"]
    int_attributes = [...]
    float_attributes = [...]

    # now do it for anticipations
    validation_errors = validate(json_data, string_attributes, int_attributes, float_attributes)
    for i, anticipation in enumerate(json_data.get('anticipations')):
        validation_error = validate(anticipation, string_attributes, int_attributes, float_attributes)
        if validation_error:
            validation_errors.append(f"anticipation -> {i} error: {validation_error}")

    return validation_errors


data = json.loads(json_data)
print(custom_validator(data))

Output: ['key id is not a string, got None']

You can build on that function.
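
For the original goal of making sure nothing in the file is null, one way to build on it is to walk the parsed JSON recursively and record the path of every `None`. A minimal sketch using only the standard library; `find_nulls` is a hypothetical helper name, and the path format simply mimics the `anticipations -> 1 -> status` style of the pydantic error message:

```python
import json
import typing


def find_nulls(node: typing.Any, path: str = "") -> typing.List[str]:
    """Return the path of every null (None) found in parsed JSON."""
    if node is None:
        return [path or "<root>"]
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            found += find_nulls(value, f"{path} -> {key}" if path else str(key))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            found += find_nulls(value, f"{path} -> {index}" if path else str(index))
    return found


# A shortened version of the JSON from the question, with two nulls added.
sample = json.loads("""
{
  "id": null,
  "name": "name",
  "anticipations": [
    {"time": "2018-01-01 00:00:00", "status": 0},
    {"time": "2018-01-02 00:00:00", "status": null}
  ]
}
""")
print(find_nulls(sample))
# prints ['id', 'anticipations -> 1 -> status']
```

Because it recurses into every dict and list, the same function covers each repeated object in `anticipations` without a separate loop per level.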