python -- pandas DataFrame to nested JSON with hierarchy levels
To generate checkboxes, I need to convert a pandas DataFrame into JSON format.
First, I have a pandas DataFrame:
cast | title | type |
---|---|---|
Daniel Craig | Sky Fall | Movie |
Ahmed Bakare | Bad Habits | Music Video |
Leonardo Dicaprio | Titanic | Movie |
Judi Dench | Sky Fall | Movie |
Kate Winslet | Titanic | Movie |
Emily Ratajkowski | Blurred Lines | Music Video |
Elle Evans | Blurred Lines | Music Video |
I want to convert it into the following format:
{
    "Movie": {
        "label": "Movie",
        "children": {
            "Sky Fall": {
                "label": "Sky Fall",
                "children": {
                    "Daniel Craig": {
                        "label": "Daniel Craig"
                    },
                    "Judi Dench": {
                        "label": "Judi Dench"
                    }
                }
            },
            "Titanic": {
                "label": "Titanic",
                "children": {
                    "Leonardo Dicaprio": {
                        "label": "Leonardo Dicaprio"
                    },
                    "Kate Winslet": {
                        "label": "Kate Winslet"
                    }
                }
            }
        }
    },
    "Music Video": {
        "label": "Music Video",
        "children": {
            "Bad Habits": {
                "label": "Bad Habits",
                "children": {
                    "Ahmed Bakare": {
                        "label": "Ahmed Bakare"
                    }
                }
            },
            "Blurred Lines": {
                "label": "Blurred Lines",
                "children": {
                    "Emily Ratajkowski": {
                        "label": "Emily Ratajkowski"
                    },
                    "Elle Evans": {
                        "label": "Elle Evans"
                    }
                }
            }
        }
    }
};
My current approach is:
menu = []
groupDict = df.groupby('type').apply(lambda g: g.drop('type', axis=1).to_dict(orient='records')).to_dict()
for key, value in groupDict.items():
    menu.append(dict(type=key,children=str(value)))
However, the result is not what I wanted:
[{'type': 'Movie',
'children': "[{'cast': 'Daniel Craig', 'title': 'Sky Fall'}, {'cast': 'Leonardo Dicaprio', 'title': 'Titanic'}, {'cast': 'Judi Dench', 'title': 'Sky Fall'}, {'cast': 'Kate Winslet', 'title': 'Titanic'}]"},
{'type': 'Music Video',
'children': "[{'cast': 'Ahmed Bakare', 'title': 'Bad Habits'}, {'cast': 'Emily Ratajkowski', 'title': 'Blurred Lines'}, {'cast': 'Elle Evans', 'title': 'Blurred Lines'}]"}]
I made a few modifications to the code from this post:
It requires CSV input, so you can simply convert the DataFrame to CSV first.
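The DataFrame-to-CSV step could look roughly like the sketch below. It assumes the DataFrame is named `df` and has the three columns from the question; the columns are reordered to type, title, cast because the script treats the leftmost column as the top level of the hierarchy.

```python
import pandas as pd

# Sample data copied from the question; `df` is an assumed variable name.
df = pd.DataFrame({
    "cast": ["Daniel Craig", "Ahmed Bakare", "Leonardo Dicaprio", "Judi Dench",
             "Kate Winslet", "Emily Ratajkowski", "Elle Evans"],
    "title": ["Sky Fall", "Bad Habits", "Titanic", "Sky Fall",
              "Titanic", "Blurred Lines", "Blurred Lines"],
    "type": ["Movie", "Music Video", "Movie", "Movie",
             "Movie", "Music Video", "Music Video"],
})

# Put the columns in hierarchy order (top level first) and drop the index.
# With no path argument, to_csv returns the CSV as a string.
csv_text = df[["type", "title", "cast"]].to_csv(index=False)
print(csv_text)
```

Passing a path instead, for example `df[["type", "title", "cast"]].to_csv("test.csv", index=False)`, writes the file directly.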
- test.csv
type,title ,cast
Movie,Sky Fall ,Daniel Craig
Music Video,Bad Habits ,Ahmed Bakare
Movie,Titanic ,Leonardo Dicaprio
Movie,Sky Fall ,Judi Dench
Movie,Titanic ,Kate Winslet
Music Video,Blurred Lines ,Emily Ratajkowski
Music Video,Blurred Lines ,Elle Evans
- test.py
import csv
import json
from collections import defaultdict


def ctree():
    """Return a defaultdict whose missing keys are themselves ctrees,
    so nested levels are created on first access.
    """
    return defaultdict(ctree)


def build_leaf(name, leaf):
    """Recursively build the desired label/children structure."""
    res = {"label": name}
    # add a children node only if the leaf actually has children
    if len(leaf) > 0:
        res["children"] = {k: build_leaf(k, v) for k, v in leaf.items()}
    return res


def main():
    """Parse the CSV file into a tree hierarchy, recursively convert the
    tree into a JSON-like structure (nested dicts), and print the result.
    """
    tree = ctree()
    # NOTE: test.csv must be in the current working directory
    # (e.g. next to this file)
    with open('test.csv', newline='') as csvfile:
        reader = csv.reader(csvfile)
        for rid, row in enumerate(reader):
            # skip the header row; remove this check if your CSV has
            # no header
            if rid == 0:
                continue
            # walk down the tree one column at a time, grouping each
            # CSV value under its parent
            leaf = tree[row[0]]
            for cid in range(1, len(row)):
                leaf = leaf[row[cid]]

    # build the custom tree structure
    res = {}
    for name, leaf in tree.items():
        res[name] = build_leaf(name, leaf)

    # print the result to the terminal
    print(json.dumps(res, indent=4, sort_keys=True))


# so let's roll
main()
- Run the test
python test.py
- Result
{
    "Movie": {
        "children": {
            "Sky Fall ": {
                "children": {
                    "Daniel Craig ": {
                        "label": "Daniel Craig "
                    },
                    "Judi Dench ": {
                        "label": "Judi Dench "
                    }
                },
                "label": "Sky Fall "
            },
            "Titanic ": {
                "children": {
                    "Kate Winslet ": {
                        "label": "Kate Winslet "
                    },
                    "Leonardo Dicaprio ": {
                        "label": "Leonardo Dicaprio "
                    }
                },
                "label": "Titanic "
            }
        },
        "label": "Movie"
    },
    "Music Video": {
        "children": {
            "Bad Habits ": {
                "children": {
                    "Ahmed Bakare ": {
                        "label": "Ahmed Bakare "
                    }
                },
                "label": "Bad Habits "
            },
            "Blurred Lines ": {
                "children": {
                    "Elle Evans": {
                        "label": "Elle Evans"
                    },
                    "Emily Ratajkowski ": {
                        "label": "Emily Ratajkowski "
                    }
                },
                "label": "Blurred Lines "
            }
        },
        "label": "Music Video"
    }
}
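Several keys in this result carry a trailing space ("Sky Fall ") because the cells in test.csv are padded, and `sort_keys=True` puts "children" before "label". As a possible variation, not part of the answer above, the sketch below builds the same label/children structure directly from rows, strips the padding, and keeps insertion order. `rows_to_tree` is a hypothetical helper name, and the rows are assumed to be ordered from the top level down (type, title, cast).

```python
import json


def rows_to_tree(rows):
    """Nest rows of (level_1, level_2, ...) values into label/children dicts."""
    tree = {}
    for row in rows:
        children = tree
        for depth, value in enumerate(row):
            label = str(value).strip()  # drop padding such as "Sky Fall "
            # reuse the node if this label was already seen under this parent
            node = children.setdefault(label, {"label": label})
            # only non-final levels get a "children" dict
            if depth < len(row) - 1:
                children = node.setdefault("children", {})
    return tree


# Rows copied from the question, ordered type -> title -> cast.
rows = [
    ("Movie", "Sky Fall", "Daniel Craig"),
    ("Music Video", "Bad Habits", "Ahmed Bakare"),
    ("Movie", "Titanic", "Leonardo Dicaprio"),
    ("Movie", "Sky Fall", "Judi Dench"),
    ("Movie", "Titanic", "Kate Winslet"),
    ("Music Video", "Blurred Lines", "Emily Ratajkowski"),
    ("Music Video", "Blurred Lines", "Elle Evans"),
]

tree = rows_to_tree(rows)
print(json.dumps(tree, indent=4))
```

With a DataFrame named `df`, the rows could be supplied without a CSV round trip via `rows_to_tree(df[["type", "title", "cast"]].itertuples(index=False, name=None))`.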