Dumping DataFrame to JSON records
I have a DataFrame df that looks like this:
   task_count task        date
0       82586  foo  2015-10-31
1       57417  foo  2016-08-31
2       47800  bar  2016-12-31
3       62331  foo  2016-02-29
4       45852  bar  2017-07-31
I want to generate output like the following:
[
{
"task": "foo",
"task_count": [82586,57417,62331],
"date": ["2015-10-31","2016-08-31","2016-02-29"]
},
{
"task": "bar",
"task_count": [47800,45852],
"date": ["2016-12-31","2017-07-31"]
}
]
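This is what I have done so far, but I cannot get the groupby to collect more than one column.
import json
result = df.groupby('task')['task_count'].apply(list).reset_index().to_json(orient='records')
print(json.dumps(json.loads(result), indent=2))
What approach should I take to get the desired output?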
You can use groupby + agg + to_dict:
df.groupby('task', as_index=False).agg(lambda x: x.tolist()).to_dict('records')
[
{
"date": [
"2016-12-31",
"2017-07-31"
],
"task_count": [
47800,
45852
],
"task": "bar"
},
{
"date": [
"2015-10-31",
"2016-08-31",
"2016-02-29"
],
"task_count": [
82586,
57417,
62331
],
"task": "foo"
}
]
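For anyone who wants to reproduce this, here is a minimal self-contained sketch of the groupby + agg + to_dict approach. The DataFrame is rebuilt by hand from the question's table, and it assumes the date column holds plain strings, which the question does not state. The name records is only for illustration.

```python
import pandas as pd

# Sample data copied from the question; dates are assumed to be plain strings.
df = pd.DataFrame({
    "task_count": [82586, 57417, 47800, 62331, 45852],
    "task": ["foo", "foo", "bar", "foo", "bar"],
    "date": ["2015-10-31", "2016-08-31", "2016-12-31",
             "2016-02-29", "2017-07-31"],
})

# Collect every non-key column into a Python list per task.
# groupby sorts the group keys by default, so "bar" comes before "foo".
records = (
    df.groupby("task", as_index=False)
      .agg(lambda x: x.tolist())
      .to_dict("records")
)

for record in records:
    print(record)
```

Each element of records is an ordinary dict, so the list can be passed straight to json.dumps if a JSON string is needed.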
If you want to generate JSON and dump the result to a file, use to_json instead of to_dict:
df.groupby('task', as_index=False)\
.agg(lambda x: x.tolist())\
.to_json('file.json', orient='records')
This creates a file.json containing:
[{"task":"bar","task_count":[47800,45852],"date":["2016-12-31","2017-07-31"]},{"task":"foo","task_count":[82586,57417,62331],"date":["2015-10-31","2016-08-31","2016-02-29"]}]
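Note that the result lists bar before foo, while the desired output in the question has foo first. That is because groupby sorts the group keys by default. A possible variation, sketched below, passes sort=False so the groups keep the order in which they first appear, and calls to_json without a path so that it returns a string that can be pretty-printed. As before, the DataFrame is rebuilt by hand with the dates assumed to be strings, and json_text and parsed are illustrative names.

```python
import json

import pandas as pd

# Sample data copied from the question; dates are assumed to be plain strings.
df = pd.DataFrame({
    "task_count": [82586, 57417, 47800, 62331, 45852],
    "task": ["foo", "foo", "bar", "foo", "bar"],
    "date": ["2015-10-31", "2016-08-31", "2016-12-31",
             "2016-02-29", "2017-07-31"],
})

# sort=False keeps groups in order of first appearance ("foo", then "bar").
# With no path argument, to_json returns the JSON text instead of writing a file.
json_text = (
    df.groupby("task", as_index=False, sort=False)
      .agg(lambda x: x.tolist())
      .to_json(orient="records")
)

# Round-trip through the json module to pretty-print the result.
parsed = json.loads(json_text)
print(json.dumps(parsed, indent=2))
```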