Write JSON values from array and nested array to single CSV
I have a JSON output from which I want to create a CSV file with two columns. The first column should contain the userId and the second should contain the value of videoSeries. The output looks like this:
{
  "start": 1490383076,
  "stop": 1492975076,
  "events": [
    {
      "time": 1491294219,
      "customParameters": [
        {
          "group": "channelId",
          "item": "dr3"
        },
        {
          "group": "videoGenre",
          "item": "unknown"
        },
        {
          "group": "videoSeries",
          "item": "min-mor-er-pink"
        },
        {
          "group": "videoSlug",
          "item": "min-mor-er-pink"
        }
      ],
      "userId": "cx:hr1y0kcbhhr61qj7kspglu767:344xy3wb5bz16"
    }
  ]
}
My CSV should look like this:
--------------------------------------------------------------
User ID videoSeries
--------------------------------------------------------------
cx:hr1y0kcbhhr61qj7kspglu767:344xy3wb5bz16 min-mor-er-pink
--------------------------------------------------------------
I have tried using ijson and pandas to get the desired output, but I can't get the values from the two different arrays into a single CSV:
import ijson
import pandas as pd

with open('MY JSON FILE', 'r') as f:
    objects = ijson.items(f, 'events.item')
    pandaReadable = list(objects)

df = pd.DataFrame(pandaReadable, columns=['userId', 'customParameters'])
df.to_csv('C:/Users/.../Desktop/output.csv', columns=['userId', 'customParameters'], index=False)
Try this approach, where d is a dictionary built from your JSON:
In [150]: d
Out[150]:
{'events': [{'customParameters': [{'group': 'channelId', 'item': 'dr3'},
{'group': 'videoGenre', 'item': 'unknown'},
{'group': 'videoSeries', 'item': 'min-mor-er-pink'},
{'group': 'videoSlug', 'item': 'min-mor-er-pink'}],
'time': 1491294219,
'userId': 'cx:hr1y0kcbhhr61qj7kspglu767:344xy3wb5bz16'}],
'start': 1490383076,
'stop': 1492975076}
Solution:
In [153]: pd.io.json.json_normalize(d['events'], 'customParameters', ['userId']) \
...: .query("group in ['videoSeries']")[['userId','item']]
...:
Out[153]:
userId item
2 cx:hr1y0kcbhhr61qj7kspglu767:344xy3wb5bz16 min-mor-er-pink
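To see why this works, it can help to look at the intermediate table before the filter is applied. The following is a minimal sketch, assuming pandas 1.0 or later, where the same function is exposed as pd.json_normalize (pd.io.json.json_normalize is the older spelling). The names flat and series_rows are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
d = {
    "start": 1490383076,
    "stop": 1492975076,
    "events": [
        {
            "time": 1491294219,
            "customParameters": [
                {"group": "channelId", "item": "dr3"},
                {"group": "videoGenre", "item": "unknown"},
                {"group": "videoSeries", "item": "min-mor-er-pink"},
                {"group": "videoSlug", "item": "min-mor-er-pink"},
            ],
            "userId": "cx:hr1y0kcbhhr61qj7kspglu767:344xy3wb5bz16",
        }
    ],
}

# record_path: produce one row per entry of the nested customParameters list.
# meta: copy the parent event's userId onto each of those rows.
flat = pd.json_normalize(d["events"], record_path="customParameters", meta=["userId"])
print(flat)

# A boolean mask does the same job as .query("group in ['videoSeries']").
series_rows = flat[flat["group"] == "videoSeries"]
print(series_rows[["userId", "item"]])
```

Each event contributes one row per custom parameter, so filtering on the group column leaves exactly the videoSeries rows, each still carrying its userId.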
If you need videoSeries as the column name:
In [154]: pd.io.json.json_normalize(d['events'], 'customParameters', ['userId']) \
...: .query("group in ['videoSeries']")[['userId','item']] \
...: .rename(columns={'item':'videoSeries'})
...:
Out[154]:
userId videoSeries
2 cx:hr1y0kcbhhr61qj7kspglu767:344xy3wb5bz16 min-mor-er-pink
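The session above stops at the DataFrame, so here is a hedged end-to-end sketch that goes from JSON text to CSV. It assumes pandas 1.0 or later (pd.json_normalize) and that the whole document fits in memory, so the standard json module is used rather than ijson. The raw string stands in for the contents of the JSON file, and the names raw, result and csv_text are illustrative.

```python
import json

import pandas as pd

# Stand-in for the file contents; with a real file, use json.load(f) instead.
raw = """
{
  "start": 1490383076,
  "stop": 1492975076,
  "events": [
    {
      "time": 1491294219,
      "customParameters": [
        {"group": "channelId", "item": "dr3"},
        {"group": "videoGenre", "item": "unknown"},
        {"group": "videoSeries", "item": "min-mor-er-pink"},
        {"group": "videoSlug", "item": "min-mor-er-pink"}
      ],
      "userId": "cx:hr1y0kcbhhr61qj7kspglu767:344xy3wb5bz16"
    }
  ]
}
"""
d = json.loads(raw)

# Flatten the nested list, keep the videoSeries rows, and rename the value column.
flat = pd.json_normalize(d["events"], record_path="customParameters", meta=["userId"])
result = (
    flat.loc[flat["group"] == "videoSeries", ["userId", "item"]]
        .rename(columns={"item": "videoSeries"})
)

# With no path argument, to_csv returns the CSV as a string.
csv_text = result.to_csv(index=False)
print(csv_text)
```

Passing a file path as the first argument to to_csv writes the same content to disk instead of returning it. If an event has no videoSeries entry, it simply produces no row in this sketch.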