Convert delimited string and value into hierarchical JSON with Python
I have data in the following format:
[['Director', 9010],
['Director - Product Manager', 9894],
['Director - Product Manager - Project Manager', 9080],
['Director - Product Manager - Project Manager - Staff', 5090],
['Director - Product Manager - Project Manager - Staff 2', 5087],
['Director - Product Manager - Project Manager 2', 9099],...]
and would like to get output that looks like this:
{
    'title': 'Director',
    'id': 9010,
    'children': [
        {
            'title': 'Product Manager',
            'id': 9894,
            'children': [
                {
                    'title': 'Project Manager',
                    'id': 9080,
                    'children': [
                        ...
                    ]
                },{
                    'title': 'Project Manager 2',
                    'id': 9099,
                    'children': [
                        ...
                    ]
                }],
                ...
            ]
        },
        ...
    ]
}
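I've been fiddling with dictionaries, but I'm struggling to match the IDs to the titles. Any suggestions are appreciated.
One approach that works is to make the outermost container a list rather than a dictionary. As we iterate over each title in the title string, we look in the current list for a dict with that title. If no such dict exists, we create a new one and append it to the current list. Either way, that dict's children list becomes the new current list, and we move on to the next title.
I also wrote a recursive prune function that removes all the empty children lists once all the data has been inserted, in case you want that.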
import json

lst = [
    ['Director', 9010],
    ['Director - Product Manager', 9894],
    ['Director - Product Manager - Project Manager', 9080],
    ['Director - Product Manager - Project Manager - Staff', 5090],
    ['Director - Product Manager - Project Manager - Staff 2', 5087],
    ['Director - Product Manager - Project Manager 2', 9099],
]

# Search for a matching name in the current list.
# If it doesn't exist, create it.
def insert(lst, name, idnum):
    for d in lst:
        if d['title'] == name:
            break
    else:
        # for/else: this branch runs only if the loop ended without a break
        d = {'title': name, 'id': idnum, 'children': []}
        lst.append(d)
    return d['children']

# Remove empty child lists
def prune(lst):
    for d in lst:
        if d['children']:
            prune(d['children'])
        else:
            del d['children']

# Insert the data into the master list
master = []
for names, idnum in lst:
    # Use a separate name so the input list is not shadowed
    current = master
    for name in [s.strip() for s in names.split(' - ')]:
        current = insert(current, name, idnum)

prune(master)

# Get the top level dict from the master list
data = master[0]
print(json.dumps(data, indent=4))
Output
{
    "title": "Director",
    "id": 9010,
    "children": [
        {
            "title": "Product Manager",
            "id": 9894,
            "children": [
                {
                    "title": "Project Manager",
                    "id": 9080,
                    "children": [
                        {
                            "title": "Staff",
                            "id": 5090
                        },
                        {
                            "title": "Staff 2",
                            "id": 5087
                        }
                    ]
                },
                {
                    "title": "Project Manager 2",
                    "id": 9099
                }
            ]
        }
    ]
}
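The insert function above scans the current list linearly for every title, and if an intermediate title has not been seen yet, it is created with the id of whichever row is being processed. As a possible alternative, here is a minimal sketch that indexes each node by its full path, so finding the parent is a single dictionary lookup. The names build_tree, rows and nodes are hypothetical, and the sketch assumes that every parent row appears before its children and that the data has a single root.

```python
import json

def build_tree(rows, sep=' - '):
    """Build a nested dict from [path, id] rows; assumes parents come first."""
    nodes = {}   # maps a tuple of titles (the full path) to its node dict
    root = None
    for path, idnum in rows:
        titles = tuple(s.strip() for s in path.split(sep))
        node = {'title': titles[-1], 'id': idnum}
        nodes[titles] = node
        if len(titles) == 1:
            # A one-element path is the root (assumption: only one exists)
            root = node
        else:
            # Raises KeyError if the parent row has not been seen yet
            parent = nodes[titles[:-1]]
            parent.setdefault('children', []).append(node)
    return root

rows = [
    ['Director', 9010],
    ['Director - Product Manager', 9894],
    ['Director - Product Manager - Project Manager', 9080],
    ['Director - Product Manager - Project Manager - Staff', 5090],
    ['Director - Product Manager - Project Manager - Staff 2', 5087],
    ['Director - Product Manager - Project Manager 2', 9099],
]

tree = build_tree(rows)
print(json.dumps(tree, indent=4))
```

Because the children key is only created when a child is actually attached, this variant needs no pruning pass. The trade-off is that it fails loudly on out-of-order input instead of creating the missing parents.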
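Taking d as your input, iterate over the input list as follows.
Since each sublist has two elements, unpack them in the loop as p (the position string) and id.
Suppose you are processing the list ['Director - Product Manager - Project Manager - Staff', 5090].
To get the title at each level, split the position string on - and strip the leading and trailing whitespace from each piece. For example,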
>>> d[3][0]
'Director - Product Manager - Project Manager - Staff'
>>> list(map(str.strip, d[3][0].split('-')))
['Director', 'Product Manager', 'Project Manager', 'Staff']
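Note that splitting on a bare - would also break up any title that itself contains a hyphen. A minimal sketch of a more defensive split, assuming the levels are always separated by ' - ' with a space on each side; split_titles is a hypothetical helper name and 'Co-Director' is made-up sample data:

```python
def split_titles(path, sep=' - '):
    # Split on the spaced separator so hyphens inside a title survive
    return [part.strip() for part in path.split(sep)]

print(split_titles('Director - Product Manager - Project Manager - Staff'))
# ['Director', 'Product Manager', 'Project Manager', 'Staff']
print(split_titles('Co-Director - Staff 2'))
# ['Co-Director', 'Staff 2']
```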
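The output dictionary, together with the position that precedes Staff (Project Manager in this case), is passed to the recursive search function, which collects every dict that matches the lookup value and returns them as a list. Take the last match. For instance, looking up Product Manager once the tree has been built: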
>>> recursive_search([data,],'Product Manager')[-1]
{'children': [{'children': [{'id': 5090, 'title': 'Staff'}, {'id': 5087, 'title': 'Staff 2'}], 'id': 9080, 'title': 'Project Manager'}, {'id': 9099, 'title': 'Project Manager 2'}], 'id': 9894, 'title': 'Product Manager'}
You then need to append the new entry (its id and title) to the children key of that result!
Putting it all together,
from pprint import pprint

d = [
    ['Director', 9010],
    ['Director - Product Manager', 9894],
    ['Director - Product Manager - Project Manager', 9080],
    ['Director - Product Manager - Project Manager - Staff', 5090],
    ['Director - Product Manager - Project Manager - Staff 2', 5087],
    ['Director - Product Manager - Project Manager 2', 9099],
]

def recursive_search(items, key):
    # Collect every dict, at any depth, that has key among its values
    found = []
    for item in items:
        if isinstance(item, list):
            found += recursive_search(item, key)
        elif isinstance(item, dict):
            if key in item.values():
                found.append(item)
            found += recursive_search(item.values(), key)
    return found

data = {}
for p, id in d:
    # A list (not a lazy map object) so len() and indexing work on Python 3
    desig = [s.strip() for s in p.split('-')]
    if len(desig) > 1:
        matches = recursive_search([data], desig[-2])
        if matches:
            # Attach the new node to the last dict matching the parent title
            res = matches[-1]
            res['children'] = res.get('children', [])
            res['children'].append({'id': id, 'title': desig[-1]})
    else:
        data = {'id': id, 'title': p}

# pprint was imported as a function, so call it directly
pprint(data)
Output:
{'children': [{'children': [{'children': [{'id': 5090, 'title': 'Staff'},
                                          {'id': 5087,
                                           'title': 'Staff 2'}],
                             'id': 9080,
                             'title': 'Project Manager'},
                            {'id': 9099, 'title': 'Project Manager 2'}],
               'id': 9894,
               'title': 'Product Manager'}],
 'id': 9010,
 'title': 'Director'}
Reference: the recursive_search function used here is a slightly modified version of the dictionary search described here