libpostal output string (dict) with duplicate keys: how do I convert the string to a dict?
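I am using the libpostal address-parsing library as an .exe file. I have a script that reads its terminal output. The output is a string in dict format, as shown below.
Here is the address string: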
"531A UPPER CROSS STREETSINGAPORE HONG LIM COMPLEX 051531 S"
The libpostal terminal output is:
'{\n "house_number": "531a",\n "road": "upper cross streetsingapore",\n "city": "hong",\n "house": "lim complex",\n "house_number": "051531 s"\n}'
I need to create a dict from this string. If a key occurs more than once, its values should be appended under that same key.
Expected output dict:
{
"house_number": "531a 051531 s",
"road": "upper cross streetsingapore",
"city": "hong",
"house": "lim complex",
}
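Any help would be appreciated.
You can use json.JSONDecoder with object_pairs_hook to decode the dict literal into a list of (key, value) tuples, use dict.setdefault to collect the values of each key into a list, and finally join each list into a single string: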
from json import JSONDecoder
string = '{\n "house_number": "531a",\n "road": "upper cross streetsingapore",\n "city": "hong",\n "house": "lim complex",\n "house_number": "051531 s"\n}'
# Returning the pairs unchanged keeps duplicate keys as a list of (key, value) tuples
pairs = JSONDecoder(object_pairs_hook=lambda x: x).decode(string)
out = {}
for key, value in pairs:
    out.setdefault(key, []).append(value)  # collect every value seen for a key
out = {k: ' '.join(v) for k, v in out.items()}  # join each list into one string
Output:
{'house_number': '531a 051531 s',
'road': 'upper cross streetsingapore',
'city': 'hong',
'house': 'lim complex'}
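If this has to be done for many addresses, the same idea can be wrapped in a hook passed straight to json.loads. The following is only a sketch: the function name merge_duplicate_keys and its sep parameter are made up for illustration, and it assumes the libpostal output is always a flat JSON object. It also shows why a plain json.loads is not enough, since it silently keeps only the last value of a repeated key.

```python
import json
from collections import defaultdict


def merge_duplicate_keys(pairs, sep=" "):
    """Hypothetical hook for json.loads: join values of repeated keys with `sep`."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    # Keys seen once keep their original value; repeated keys are joined into one string
    return {
        key: values[0] if len(values) == 1 else sep.join(map(str, values))
        for key, values in grouped.items()
    }


raw = '{\n "house_number": "531a",\n "road": "upper cross streetsingapore",\n "city": "hong",\n "house": "lim complex",\n "house_number": "051531 s"\n}'

# Without a hook, the first "house_number" is lost
print(json.loads(raw)["house_number"])  # 051531 s

# With the hook, both values are kept under the one key
merged = json.loads(raw, object_pairs_hook=merge_duplicate_keys)
print(merged["house_number"])  # 531a 051531 s
```

Doing the merge inside the hook avoids the intermediate list of tuples, and keys that appear only once come back unchanged.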