Python pandas 中的深层嵌套 JSON 规范化
Deep Nested JSON Normalization in Python pandas
我想将深层嵌套的 JSON 文件转换为 CSV。 JSON 文件是
{ "abcd-performance:output": {
"performance": [
{
"sample": {
"objectId": "abcd-device:100:12",
"ifName": "Carrier1/1",
"timeStamp": "2021-10-20T03:15:00.000Z",
"type": "radio",
"invalid": false,
"attribute": [
{
"parameter": "rslReadingMean",
"value": -36.4
},
{
"parameter": "qam-32rx_time",
"value": 0
}
]
}
},
{
"sample": {
"objectId": "abcd-device:100:12",
"ifName": "Carrier1/1",
"timeStamp": "2021-10-20T03:30:00.000Z",
"type": "radio",
"invalid": false,
"attribute": [
{
"parameter": "rslReadingMean",
"value": -36.5
},
{
"parameter": "qam-32rx_time",
"value": 0
}
]
}
},
{
"sample": {
"objectId": "abcd-device:100:13",
"ifName": "Terminal",
"timeStamp": "2021-10-20T03:30:00.000Z",
"type": "sensor",
"invalid": false,
"attribute": [
{
"parameter": "InputCurrent",
"value": 1.14
},
{
"parameter": "Temperature",
"value": 61.5
},
{
"parameter": "InputVoltage",
"value": 54.777
}
]
}
}
] } }
代码
with open('Performance_Interface_data.json') as f:
data = json.load(f)
df = pd.json_normalize(data['abcd-performance:output'], 'performance', max_level=4)
print(df.columns)
print(df)
要求的输出是
sample_objectId;sample_ifName;sample_timeStamp;sample_type;sample_invalid;sample_attribute_parameter;sample_attribute_value
我无法对最后一列进行标准化。
您需要根据 JSON.
指定 meta
和 record_path
的参数
df = pd.json_normalize(data['abcd-performance:output']['performance'], record_path=['sample','attribute'], meta=[['sample', 'objectId'], ['sample', 'ifName'], ['sample', 'timeStamp'], ['sample', 'type'], ['sample', 'invalid']])
In [50]: df
Out[50]:
parameter value sample.objectId sample.ifName sample.timeStamp sample.type sample.invalid
0 rslReadingMean -36.400 abcd-device:100:12 Carrier1/1 2021-10-20T03:15:00.000Z radio False
1 qam-32rx_time 0.000 abcd-device:100:12 Carrier1/1 2021-10-20T03:15:00.000Z radio False
2 rslReadingMean -36.500 abcd-device:100:12 Carrier1/1 2021-10-20T03:30:00.000Z radio False
3 qam-32rx_time 0.000 abcd-device:100:12 Carrier1/1 2021-10-20T03:30:00.000Z radio False
4 InputCurrent 1.140 abcd-device:100:13 Terminal 2021-10-20T03:30:00.000Z sensor False
5 Temperature 61.500 abcd-device:100:13 Terminal 2021-10-20T03:30:00.000Z sensor False
6 InputVoltage 54.777 abcd-device:100:13 Terminal 2021-10-20T03:30:00.000Z sensor False
我想将深层嵌套的 JSON 文件转换为 CSV。 JSON 文件是
{ "abcd-performance:output": { "performance": [ { "sample": { "objectId": "abcd-device:100:12", "ifName": "Carrier1/1", "timeStamp": "2021-10-20T03:15:00.000Z", "type": "radio", "invalid": false, "attribute": [ { "parameter": "rslReadingMean", "value": -36.4 }, { "parameter": "qam-32rx_time", "value": 0 } ] } }, { "sample": { "objectId": "abcd-device:100:12", "ifName": "Carrier1/1", "timeStamp": "2021-10-20T03:30:00.000Z", "type": "radio", "invalid": false, "attribute": [ { "parameter": "rslReadingMean", "value": -36.5 }, { "parameter": "qam-32rx_time", "value": 0 } ] } }, { "sample": { "objectId": "abcd-device:100:13", "ifName": "Terminal", "timeStamp": "2021-10-20T03:30:00.000Z", "type": "sensor", "invalid": false, "attribute": [ { "parameter": "InputCurrent", "value": 1.14 }, { "parameter": "Temperature", "value": 61.5 }, { "parameter": "InputVoltage", "value": 54.777 } ] } } ] } }
代码
with open('Performance_Interface_data.json') as f:
data = json.load(f)
df = pd.json_normalize(data['abcd-performance:output'], 'performance', max_level=4)
print(df.columns)
print(df)
要求的输出是 sample_objectId;sample_ifName;sample_timeStamp;sample_type;sample_invalid;sample_attribute_parameter;sample_attribute_value
我无法对最后一列进行标准化。
您需要根据 JSON.
指定meta
和 record_path
的参数
df = pd.json_normalize(data['abcd-performance:output']['performance'], record_path=['sample','attribute'], meta=[['sample', 'objectId'], ['sample', 'ifName'], ['sample', 'timeStamp'], ['sample', 'type'], ['sample', 'invalid']])
In [50]: df
Out[50]:
parameter value sample.objectId sample.ifName sample.timeStamp sample.type sample.invalid
0 rslReadingMean -36.400 abcd-device:100:12 Carrier1/1 2021-10-20T03:15:00.000Z radio False
1 qam-32rx_time 0.000 abcd-device:100:12 Carrier1/1 2021-10-20T03:15:00.000Z radio False
2 rslReadingMean -36.500 abcd-device:100:12 Carrier1/1 2021-10-20T03:30:00.000Z radio False
3 qam-32rx_time 0.000 abcd-device:100:12 Carrier1/1 2021-10-20T03:30:00.000Z radio False
4 InputCurrent 1.140 abcd-device:100:13 Terminal 2021-10-20T03:30:00.000Z sensor False
5 Temperature 61.500 abcd-device:100:13 Terminal 2021-10-20T03:30:00.000Z sensor False
6 InputVoltage 54.777 abcd-device:100:13 Terminal 2021-10-20T03:30:00.000Z sensor False