Python/Pydantic - using a list with json objects
I have a working model to receive a json data set using pydantic. The model data set looks like this:
data = {'thing_number': 123,
        'thing_description': 'duck',
        'thing_amount': 4.56}
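What I would like to do is have a list of json files as the data set and be able to validate them. Ultimately the list will be converted to records in pandas for further processing. My goal is to validate an arbitrarily long list of json entries that looks something like this: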
bigger_data = [{'thing_number': 123,
                'thing_description': 'duck',
                'thing_amount': 4.56},
               {'thing_number': 456,
                'thing_description': 'cow',
                'thing_amount': 7.89}]
The basic setup I have now is below. Note that adding the class ItemList is part of the attempt to get the arbitrary length to work.
from typing import List
from pydantic import BaseModel
from pydantic.schema import schema
import json

class Item(BaseModel):
    thing_number: int
    thing_description: str
    thing_amount: float

class ItemList(BaseModel):
    each_item: List[Item]
The basic code then produces what I think I'm looking for: an array object that will take Item objects.
item_schema = schema([ItemList])
print(json.dumps(item_schema, indent=2))
{
  "definitions": {
    "Item": {
      "title": "Item",
      "type": "object",
      "properties": {
        "thing_number": {
          "title": "Thing_Number",
          "type": "integer"
        },
        "thing_description": {
          "title": "Thing_Description",
          "type": "string"
        },
        "thing_amount": {
          "title": "Thing_Amount",
          "type": "number"
        }
      },
      "required": [
        "thing_number",
        "thing_description",
        "thing_amount"
      ]
    },
    "ItemList": {
      "title": "ItemList",
      "type": "object",
      "properties": {
        "each_item": {
          "title": "Each_Item",
          "type": "array",
          "items": {
            "$ref": "#/definitions/Item"
          }
        }
      },
      "required": [
        "each_item"
      ]
    }
  }
}
The setup works on a single json item being passed:
item = Item(**data)
print(item)
Item thing_number=123 thing_description='duck' thing_amount=4.56
But when I try to pass a single item into the ItemList model, it returns an error:
item_list = ItemList(**data)
---------------------------------------------------------------------------
ValidationError Traceback (most recent call last)
<ipython-input-94-48efd56e7b6c> in <module>
----> 1 item_list = ItemList(**data)
/opt/conda/lib/python3.7/site-packages/pydantic/main.cpython-37m-x86_64-linux-gnu.so in pydantic.main.BaseModel.__init__()
/opt/conda/lib/python3.7/site-packages/pydantic/main.cpython-37m-x86_64-linux-gnu.so in pydantic.main.validate_model()
ValidationError: 1 validation error for ItemList
each_item
field required (type=value_error.missing)
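I've also tried passing bigger_data into the array, thinking that it would need to start as a list. That also returns an error, although I at least have a better understanding of the dictionary error; I just can't figure out how to resolve it.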
item_list2 = ItemList(**data_big)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-100-8fe9a5414bd6> in <module>
----> 1 item_list2 = ItemList(**data_big)
TypeError: MetaModel object argument after ** must be a mapping, not list
Thanks.
Other things I've tried
I've tried passing the data into the specific key, with a little more luck (maybe?).
item_list2 = ItemList(each_item=data_big)
---------------------------------------------------------------------------
ValidationError Traceback (most recent call last)
<ipython-input-111-07e5c12bf8b4> in <module>
----> 1 item_list2 = ItemList(each_item=data_big)
/opt/conda/lib/python3.7/site-packages/pydantic/main.cpython-37m-x86_64-linux-gnu.so in pydantic.main.BaseModel.__init__()
/opt/conda/lib/python3.7/site-packages/pydantic/main.cpython-37m-x86_64-linux-gnu.so in pydantic.main.validate_model()
ValidationError: 6 validation errors for ItemList
each_item -> 0 -> thing_number
field required (type=value_error.missing)
each_item -> 0 -> thing_description
field required (type=value_error.missing)
each_item -> 0 -> thing_amount
field required (type=value_error.missing)
each_item -> 1 -> thing_number
field required (type=value_error.missing)
each_item -> 1 -> thing_description
field required (type=value_error.missing)
each_item -> 1 -> thing_amount
field required (type=value_error.missing)
from typing import List
from pydantic import BaseModel
import json

class Item(BaseModel):
    thing_number: int
    thing_description: str
    thing_amount: float

class ItemList(BaseModel):
    each_item: List[Item]
Based on your code, with each_item as a list of items:
a_duck = Item(thing_number=123, thing_description="duck", thing_amount=4.56)
print(a_duck.json())
a_list = ItemList(each_item=[a_duck])
print(a_list.json())
produces the following output:
{"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56}
{"each_item": [{"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56}]}
Using those as "entry json":
a_json_duck = {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56}
a_json_list = {
"each_item": [
{"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56}
]
}
print(Item(**a_json_duck))
print(ItemList(**a_json_list))
works fine and generates:
Item thing_number=123 thing_description='duck' thing_amount=4.56
ItemList each_item=[<Item thing_number=123 thing_description='duck' thing_amount=4.56>]
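A plain-Python sketch of why these two calls succeed while the ones in the question fail: `**` unpacking turns a mapping's keys into keyword arguments, so the keys have to be the model's field names, and a list cannot be unpacked this way at all. `show_kwargs` is a hypothetical stand-in for a model constructor and is not part of pydantic.

```python
def show_kwargs(**kwargs):
    """Hypothetical stand-in for a model constructor: returns the keyword names it received."""
    return sorted(kwargs)


single_item = {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56}
wrapped = {"each_item": [single_item]}

# Unpacking the bare item passes thing_number etc., but no each_item keyword,
# which matches the "each_item: field required" error in the question.
bare_keys = show_kwargs(**single_item)

# Unpacking the wrapper passes exactly the keyword that ItemList declares.
wrapped_keys = show_kwargs(**wrapped)

# A list has no keys, so Python itself rejects the call before any validation runs.
try:
    show_kwargs(**[single_item])
except TypeError as exc:
    list_error = str(exc)

print(bare_keys)
print(wrapped_keys)
print(list_error)
```

So a bare list has to be passed by keyword (each_item=...) or wrapped in a dict under the each_item key before it is unpacked.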
That leaves the case of the bare data on its own:
just_datas = [
{"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56},
{"thing_number": 456, "thing_description": "cow", "thing_amount": 7.89},
]
item_list = ItemList(each_item=just_datas)
print(item_list)
print(type(item_list.each_item[1]))
print(item_list.each_item[1])
Those work as expected:
ItemList each_item=[<Item thing_number=123 thing_description='duck'thing_amount=4.56>,<Item thin…
<class '__main__.Item'>
Item thing_number=456 thing_description='cow' thing_amount=7.89
So unless I'm missing something, the pydantic library works as expected.
My pydantic version: 0.30, Python 3.7.4
Reading from a similar file:
json_data_file = """[
{"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56},
{"thing_number": 456, "thing_description": "cow", "thing_amount": 7.89}]"""
from io import StringIO
item_list2 = ItemList(each_item=json.load(StringIO(json_data_file)))
works fine too.
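Since the stated goal is to validate the list and then hand it to pandas as records, here is a minimal sketch of that step, assuming the same Item and ItemList models and the pydantic 1.x style `.dict()` method. The `bad_data` list is invented for illustration; it shows how a failing entry is reported by its position in the list.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class Item(BaseModel):
    thing_number: int
    thing_description: str
    thing_amount: float


class ItemList(BaseModel):
    each_item: List[Item]


good_data = [
    {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56},
    {"thing_number": 456, "thing_description": "cow", "thing_amount": 7.89},
]

# Illustrative bad input (an assumption, not from the question): entry 1 has a non-integer thing_number.
bad_data = [
    {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56},
    {"thing_number": "not a number", "thing_description": "cow", "thing_amount": 7.89},
]

# Validate the whole list, then turn each validated Item back into a plain dict.
records = [item.dict() for item in ItemList(each_item=good_data).each_item]

# Each reported error carries the path to the offending field.
try:
    ItemList(each_item=bad_data)
except ValidationError as exc:
    failed_locations = [error["loc"] for error in exc.errors()]

print(records)
print(failed_locations)
```

The resulting `records` is a plain list of dicts, which is a shape that `pandas.DataFrame(records)` accepts.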
To avoid having "each_item" in the ItemList, you can use the __root__ Pydantic keyword:
from typing import List
from pydantic import BaseModel

class Item(BaseModel):
    thing_number: int
    thing_description: str
    thing_amount: float

class ItemList(BaseModel):
    __root__: List[Item]    # ⯇-- __root__
To build the item_list:
just_data = [
{"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56},
{"thing_number": 456, "thing_description": "cow", "thing_amount": 7.89},
]
item_list = ItemList(__root__=just_data)
a_json_duck = {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56}
# list.append() bypasses validation, so wrap the dict in Item before appending
item_list.__root__.append(Item(**a_json_duck))
Web frameworks that support Pydantic often jsonify such an ItemList as a JSON array, without the intermediate __root__ keyword.
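The following also works, and does not require a root type.
To convert from a List[dict] to a List[Item]:
from pydantic import parse_obj_as
items = parse_obj_as(List[Item], bigger_data)
To convert from a JSON str to a List[Item]:
from pydantic import parse_raw_as
items = parse_raw_as(List[Item], bigger_data_json)
To convert from a List[Item] to a JSON str:
from pydantic.json import pydantic_encoder
bigger_data_json = json.dumps(items, default=pydantic_encoder)
Or with a custom encoder: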
def custom_encoder(**kwargs):
    def base_encoder(obj):
        if isinstance(obj, BaseModel):
            return obj.dict(**kwargs)
        else:
            return pydantic_encoder(obj)
    return base_encoder

bigger_data_json = json.dumps(items, default=custom_encoder(by_alias=True))
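Putting the pieces above together, here is a self-contained round trip from a list of dicts to a list of Item objects and back to JSON. It is a sketch that assumes the pydantic 1.x API names used in this answer (`parse_obj_as`, `pydantic.json.pydantic_encoder`); newer pydantic releases may deprecate or rename them.

```python
import json
from typing import List

from pydantic import BaseModel, parse_obj_as
from pydantic.json import pydantic_encoder


class Item(BaseModel):
    thing_number: int
    thing_description: str
    thing_amount: float


bigger_data = [
    {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56},
    {"thing_number": 456, "thing_description": "cow", "thing_amount": 7.89},
]

# List[dict] -> List[Item]: every entry is validated against Item.
items = parse_obj_as(List[Item], bigger_data)

# List[Item] -> JSON str: json.dumps calls pydantic_encoder for each Item it cannot serialize itself.
bigger_data_json = json.dumps(items, default=pydantic_encoder)

# Decode again to check that nothing was lost on the way.
round_tripped = json.loads(bigger_data_json)

print(items)
print(bigger_data_json)
```

Because no wrapper model is involved, the JSON produced is a bare array, the same shape as the input list.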