Nested JSON in a customised format from a pandas DataFrame, with added labels
DataFrame
import pandas as pd

df = {"UNIT": ["UNIT1", "UNIT1", "UNIT2", "UNIT2"],
      "PROJECT": ["A", "A", "C", "C"],
      "TEAM": [1, 2, 1, 2],
      "NAME": ["FANNY", "KATY", "PERCY", "PETER"],
      "ID": [123, 234, 333, 222]}
data = pd.DataFrame(df)
    UNIT PROJECT  TEAM   NAME   ID
0  UNIT1       A     1  FANNY  123
1  UNIT1       A     2   KATY  234
2  UNIT2       C     1  PERCY  333
3  UNIT2       C     2  PETER  222
Expected output
[
    {
        "UNIT": "UNIT1",
        "PROJECT": "A",
        "TEAM_DETAIL": [
            {
                "TEAM": 1,
                "MEMBER": [
                    {
                        "NAME": "FANNY",
                        "ID": 123
                    }
                ]
            },
            {
                "TEAM": 2,
                "MEMBER": [
                    {
                        "NAME": "KATY",
                        "ID": 234
                    }
                ]
            }
        ]
    },
    {
        "UNIT": "UNIT2",
        "PROJECT": "C",
        "TEAM_DETAIL": [
            {
                "TEAM": 1,
                "MEMBER": [
                    {
                        "NAME": "PERCY",
                        "ID": 333
                    }
                ]
            },
            {
                "TEAM": 2,
                "MEMBER": [
                    {
                        "NAME": "PETER",
                        "ID": 222
                    }
                ]
            }
        ]
    }
]
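In this case, I want to group the data by TEAM so that the details of each member in each team are shown.
Without the custom labels such as TEAM_DETAIL and MEMBER, this is easy to achieve with .to_dict().
However, I do not know how to add the labels at each level.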
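You have to use a first groupby to create the MEMBER list for each (UNIT, PROJECT, TEAM) combination. Then you can use a second groupby on (UNIT, PROJECT) to create the TEAM_DETAIL list.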
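To make the first step easier to follow, here is a minimal sketch of the intermediate table after only the first groupby. The variable name members is my own, and selecting the NAME and ID columns before calling apply is a small variation on the approach described above (it keeps the grouping columns out of the lambda), not something the original answer does.

```python
import pandas as pd

df = pd.DataFrame({
    "UNIT": ["UNIT1", "UNIT1", "UNIT2", "UNIT2"],
    "PROJECT": ["A", "A", "C", "C"],
    "TEAM": [1, 2, 1, 2],
    "NAME": ["FANNY", "KATY", "PERCY", "PETER"],
    "ID": [123, 234, 333, 222],
})

# First groupby only: one row per (UNIT, PROJECT, TEAM), with that team's
# members collected into a list of {"NAME": ..., "ID": ...} records.
members = (
    df.groupby(["UNIT", "PROJECT", "TEAM"])[["NAME", "ID"]]
      .apply(lambda g: g.to_dict("records"))
      .reset_index()                   # the unnamed result column is called 0
      .rename(columns={0: "MEMBER"})   # give it the label wanted in the JSON
)

print(members)
```

The second groupby then does the same thing one level up, this time collecting the TEAM and MEMBER columns into the TEAM_DETAIL list.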
Full code:
import pandas as pd

data = {"UNIT": ["UNIT1", "UNIT1", "UNIT2", "UNIT2"],
        "PROJECT": ["A", "A", "C", "C"],
        "TEAM": [1, 2, 1, 2],
        "NAME": ["FANNY", "KATY", "PERCY", "PETER"],
        "ID": [123, 234, 333, 222]}
df = pd.DataFrame(data)

# First groupby: collect each team's members into a list of NAME/ID records.
# Second groupby: collect each UNIT/PROJECT's teams into the TEAM_DETAIL list.
# The result is named `result` so it does not shadow the standard json module.
result = (df.groupby(['UNIT', 'PROJECT', 'TEAM'])
            .apply(lambda x: x[['NAME', 'ID']].to_dict('records'))
            .reset_index()
            .rename(columns={0: 'MEMBER'})
            .groupby(['UNIT', 'PROJECT'])
            .apply(lambda x: x[['TEAM', 'MEMBER']].to_dict('records'))
            .reset_index()
            .rename(columns={0: 'TEAM_DETAIL'})
            .to_json(orient='records'))

print(result)
Output:
[{"UNIT":"UNIT1","PROJECT":"A","TEAM_DETAIL":[{"TEAM":1,"MEMBER":[{"NAME":"FANNY","ID":123}]},{"TEAM":2,"MEMBER":[{"NAME":"KATY","ID":234}]}]},{"UNIT":"UNIT2","PROJECT":"C","TEAM_DETAIL":[{"TEAM":1,"MEMBER":[{"NAME":"PERCY","ID":333}]},{"TEAM":2,"MEMBER":[{"NAME":"PETER","ID":222}]}]}]
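If the chained groupby/apply version is hard to read, the same nesting can also be built with two explicit loops. This is only a sketch of an alternative, not the method from the answer above: the helper name nest is hypothetical, and the int()/str() conversions assume TEAM and ID are integers and NAME is a string, so that the standard json module can serialise the values.

```python
import json

import pandas as pd

df = pd.DataFrame({
    "UNIT": ["UNIT1", "UNIT1", "UNIT2", "UNIT2"],
    "PROJECT": ["A", "A", "C", "C"],
    "TEAM": [1, 2, 1, 2],
    "NAME": ["FANNY", "KATY", "PERCY", "PETER"],
    "ID": [123, 234, 333, 222],
})


def nest(frame):
    """Build the UNIT/PROJECT -> TEAM_DETAIL -> MEMBER structure (hypothetical helper)."""
    result = []
    # Outer level: one entry per (UNIT, PROJECT) pair.
    for (unit, project), unit_rows in frame.groupby(["UNIT", "PROJECT"]):
        team_detail = []
        # Inner level: one entry per TEAM within that pair.
        for team, team_rows in unit_rows.groupby("TEAM"):
            # Convert to plain Python types (assumes integer TEAM/ID, string NAME).
            member = [
                {"NAME": str(name), "ID": int(id_)}
                for name, id_ in zip(team_rows["NAME"], team_rows["ID"])
            ]
            team_detail.append({"TEAM": int(team), "MEMBER": member})
        result.append(
            {"UNIT": unit, "PROJECT": project, "TEAM_DETAIL": team_detail}
        )
    return result


nested = nest(df)
print(json.dumps(nested, indent=4))
```

Because this returns ordinary lists and dicts, json.dumps with indent gives a pretty-printed layout like the expected output in the question, and extra labels can be added at any level by editing the dict literals.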