Replace JSON key values and break up values in Python
I have a set of ndjson data that looks like this:
{'ADDRESS_CITY': 'Whittier', 'ADDRESS_LINE_1': '905 Greenleaf Avenue', 'ADDRESS_STATE': 'CA', 'ADDRESS_ZIP': '90402',},
{'ADDRESS_CITY': 'Cedar Falls', 'ADDRESS_LINE_1': '93323 Maplewood Dr', 'ADDRESS_STATE': 'CA', 'ADDRESS_ZIP': '95014'}
I need to pass the values above into an API request, specifically in the body, in the following format:
data=[
    {
        "addressee":"Greenleaf Avenue",
        "street":"905 Greenleaf Avenue",
        "city":"Whittier",
        "state":"CA",
        "zipcode":"90402",
    },
    {
        "addressee":"93323",
        "street":"Maplewood Dr",
        "city":"Cedar Falls",
        "state":"CA",
        "zipcode":"95014",
    }
]
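As you can see, the keys are different, so I need to rename them to line up with the right data and pass them along under the new key names (i.e. address_line_1 goes to addressee). There will be 10k addresses in this request.
I didn't note it in my first example, but each address has an ID associated with it, which I have to remove to make the request and then add back in afterwards.
So I ended up solving it with the code below. Is there anything more Pythonic? This doesn't feel all that eloquent to me...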
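import json

import ndjson
import requests

# `addresses` (the raw ndjson text) and `url` are defined elsewhere.
addresses = ndjson.loads(addresses)
# Rename the keys by round-tripping through a JSON string.
data = json.loads(json.dumps(addresses).replace('"ADDRESS_CITY"','"city"').replace('"ADDRESS_LINE_1"','"street"').replace('"ADDRESS_STATE"','"state"').replace('"ADDRESS_ZIP"','"zipcode"'))
ids = []
for i in data:
    i['candidates'] = 1
    # Set the ID aside; it must not be sent in the request.
    ids.append(i["ID"])
    del i["ID"]
response = requests.request("POST", url, json=data)
resp_data = response.json()
# Add the IDs back by position.
a = 0
for i in resp_data:
    i['ID'] = ids[a]
    a = a + 1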
Translate them with a dictionary:
translations = {
    "ADDRESS_CITY": "city",
    # etc.
}
input_data = ...  # your data here
data = [{translations[k]: v for k, v in row.items()} for row in input_data]
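To fill in the rest for the data in the question, here is a minimal sketch of the same dictionary approach. The full `translations` mapping is taken from the `.replace()` chain in the question; the helper name `to_request_rows` and the sample IDs are made up for illustration. It sets the ID aside and adds `candidates` in one pass, so the JSON string round trip is not needed.

```python
# Key mapping taken from the question's .replace() chain.
translations = {
    "ADDRESS_CITY": "city",
    "ADDRESS_LINE_1": "street",
    "ADDRESS_STATE": "state",
    "ADDRESS_ZIP": "zipcode",
}


def to_request_rows(rows):
    """Return (ids, data): the IDs in input order and the renamed rows without them.

    Hypothetical helper, not part of any library.
    """
    ids = [row["ID"] for row in rows]
    data = [
        # Rename every mapped key; "ID" is not in the mapping, so it is left out.
        {**{translations[k]: v for k, v in row.items() if k in translations},
         "candidates": 1}
        for row in rows
    ]
    return ids, data


input_data = [
    {"ADDRESS_CITY": "Whittier", "ADDRESS_LINE_1": "905 Greenleaf Avenue",
     "ADDRESS_STATE": "CA", "ADDRESS_ZIP": "90402", "ID": 111},
    {"ADDRESS_CITY": "Cedar Falls", "ADDRESS_LINE_1": "93323 Maplewood Dr",
     "ADDRESS_STATE": "CA", "ADDRESS_ZIP": "95014", "ID": 222},
]
ids, data = to_request_rows(input_data)
```

Note that any key missing from `translations` is silently dropped here, and the input rows are left unmodified, so the IDs remain available on `input_data` as well.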
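If you want to make things a little easier, I'd suggest modeling your input data with data classes. The main benefit is that you can use dot (.) access for attributes, and you don't need to work with dicts that have dynamic keys. You also benefit from type hinting, so your IDE should be able to assist you better as well.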
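In this case, I'd suggest pairing it with a JSON serialization library such as dataclass-wizard, which supports this use case well. As of the latest version, v0.15.0, it should also support excluding fields from the serialization/dump process.
Here is a simple example I put together that uses the desired key mapping from above: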
import json
from dataclasses import dataclass, field
# note: for python 3.9+, you can import this from `typing` instead
from typing_extensions import Annotated

from dataclass_wizard import JSONWizard, json_key


@dataclass
class AddressInfo(JSONWizard):
    """
    AddressInfo dataclass
    """
    city: Annotated[str, json_key('ADDRESS_CITY')]
    street: Annotated[str, json_key('ADDRESS_LINE_1')]
    state: Annotated[str, json_key('ADDRESS_STATE')]

    # pass `dump=False`, so we exclude the field in serialization.
    id: Annotated[int, json_key('ID', dump=False)]

    # you could also annotate the below like `Union[str, int]`
    # if you want to retain it as a string.
    zipcode: Annotated[int, json_key('ADDRESS_ZIP')]

    # exclude this field from the constructor (and from the
    # de-serialization process)
    candidates: int = field(default=1, init=False)
Sample usage of the above:
input_obj = [{'ADDRESS_CITY': 'Whittier', 'ADDRESS_LINE_1': '905 Greenleaf Avenue',
              'ADDRESS_STATE': 'CA', 'ADDRESS_ZIP': '90402',
              'ID': 111},
             {'ADDRESS_CITY': 'Cedar Falls', 'ADDRESS_LINE_1': '93323 Maplewood Dr',
              'ADDRESS_STATE': 'CA', 'ADDRESS_ZIP': '95014',
              'ID': 222}]

addresses = AddressInfo.from_list(input_obj)

print('-- Addresses')
for a in addresses:
    print(repr(a))

out_list = [a.to_dict() for a in addresses]

print('-- To JSON')
print(json.dumps(out_list, indent=2))

# alternatively, with the latest version (0.15.1)
# print(AddressInfo.list_to_json(addresses, indent=2))
Note: you can still access the id of each address as normal, even though that field is omitted from the JSON result.
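Because the IDs stay available on the input side, adding them back to the response can be a single `zip` rather than a manual counter. The sketch below uses plain dicts and a hard-coded `resp_data` in place of `response.json()`; the response fields are invented, and it assumes the API returns one result per address in the same order as the request, which the index-based loop in the question also relies on.

```python
# IDs collected before the request, in request order (sample values).
ids = [111, 222]

# Stand-in for `response.json()`; the "status" field is invented for illustration.
resp_data = [
    {"status": "ok"},
    {"status": "ok"},
]

# Pair each result with its ID by position instead of keeping a counter.
for result, address_id in zip(resp_data, ids):
    result["ID"] = address_id
```

With the `AddressInfo` objects above, the list of IDs could come from `[a.id for a in addresses]`.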