嵌套 JSON 解压到 PD 数据帧
Nested JSON unpack into PD dataframe
我想就我拥有的这个 .json 文件寻求帮助。
我已经广泛查看了 pd.json_normalize() 方法,但无法正确设置格式。
我开始试验的代码行是这样的
'''
result_df = pd.json_normalize(cgcryptohistory_data)
'''
我很想将我的 json 格式化为格式如下的 df:
date
bitcoin prices
bitcoin market_caps
bitcoin total_volumes
ethereum prices
ethereum market_caps
ethereum total_volumes
1637920962758
55084.24409740329
1040185692035.8112
4096.986983019884
...
...
1637924583096
...
...
...
...
...
...
我一直在查看此文档,但无法使其与未命名的嵌套值一起使用。 https://pandas.pydata.org/pandas-docs/version/1.2.0/reference/api/pandas.json_normalize.html
https://www.kaggle.com/jboysen/quick-tutorial-flatten-nested-json-in-pandas/notebook
[
[
{
"crypto": "bitcoin"
}
],
{
"prices": [
[
1637920962758,
55084.24409740329
],
[
1637924583096,
54657.9826454445
],
[
1637928143387,
54031.99796233907
],
[
1638524408000,
56556.355173823926
]
],
"market_caps": [
[
1637920962758,
1040185692035.8112
],
[
1637924583096,
1032137732028.0712
],
[
1637928143387,
1020318960913.6139
],
[
1638524408000,
1068341065780.2579
]
],
"total_volumes": [
[
1637920962758,
40002799175.46155
],
[
1637924583096,
38579701553.8867
],
[
1637928143387,
39373185822.85809
],
[
1638524408000,
32567680716.236423
]
]
},
[
{
"crypto": "ethereum"
}
],
{
"prices": [
[
1637920951704,
4096.986983019884
],
[
1637924408082,
4072.6963895955864
],
[
1637928090810,
4021.2930336538925
],
[
1638524390000,
4559.839444343959
]
],
"market_caps": [
[
1637920951704,
485474079335.9266
],
[
1637924408082,
482758573953.61304
],
[
1637928090810,
479260985689.3548
],
[
1638524390000,
540740261905.95264
]
],
"total_volumes": [
[
1637920951704,
25972933719.35031
],
[
1637924408082,
26468521371.13646
],
[
1637928090810,
27042124946.11916
],
[
1638524390000,
20268892519.524815
]
]
}
]
假设 js 是您的 json,这就是我的做法。
l = []
for i in range(0,len(js),2):
prices = [k[1] for k in js[i+1]["prices"]]
market_caps = [k[1] for k in js[i+1]["market_caps"]]
total_volumes = [k[1] for k in js[i+1]["total_volumes"]]
date = [k[0] for k in js[i+1]["total_volumes"]]
crypto = js[i][0]["crypto"]
df = pd.DataFrame({"crypto":crypto,"prices":prices,"market_caps":market_caps,"total_volumes":total_volumes,"date":date})
l.append(df)
df = pd.concat(l)
输出:
crypto prices market_caps total_volumes date
0 bitcoin 55084.244097 1.040186e+12 4.000280e+10 1637920962758
1 bitcoin 54657.982645 1.032138e+12 3.857970e+10 1637924583096
2 bitcoin 54031.997962 1.020319e+12 3.937319e+10 1637928143387
3 bitcoin 56556.355174 1.068341e+12 3.256768e+10 1638524408000
0 ethereum 4096.986983 4.854741e+11 2.597293e+10 1637920951704
1 ethereum 4072.696390 4.827586e+11 2.646852e+10 1637924408082
2 ethereum 4021.293034 4.792610e+11 2.704212e+10 1637928090810
3 ethereum 4559.839444 5.407403e+11 2.026889e+10 1638524390000
这样它的可扩展性更强,你可以像这样过滤你想要的加密货币:
df[df.crypto == "bitcoin"]
输出
crypto prices market_caps total_volumes date
0 bitcoin 55084.244097 1.040186e+12 4.000280e+10 1637920962758
1 bitcoin 54657.982645 1.032138e+12 3.857970e+10 1637924583096
2 bitcoin 54031.997962 1.020319e+12 3.937319e+10 1637928143387
3 bitcoin 56556.355174 1.068341e+12 3.256768e+10 1638524408000
我想就我拥有的这个 .json 文件寻求帮助。 我已经广泛查看了 pd.json_normalize() 方法,但无法正确设置格式。
我开始试验的代码行是这样的 ''' result_df = pd.json_normalize(cgcryptohistory_data) '''
我很想将我的 json 格式化为格式如下的 df:
date | bitcoin prices | bitcoin market_caps | bitcoin total_volumes | ethereum prices | ethereum market_caps | ethereum total_volumes | |
---|---|---|---|---|---|---|---|
1637920962758 | 55084.24409740329 | 1040185692035.8112 | 4096.986983019884 | ... | ... | ||
1637924583096 | ... | ... | ... | ... | ... | ... |
我一直在查看此文档,但无法使其与未命名的嵌套值一起使用。 https://pandas.pydata.org/pandas-docs/version/1.2.0/reference/api/pandas.json_normalize.html https://www.kaggle.com/jboysen/quick-tutorial-flatten-nested-json-in-pandas/notebook
[
[
{
"crypto": "bitcoin"
}
],
{
"prices": [
[
1637920962758,
55084.24409740329
],
[
1637924583096,
54657.9826454445
],
[
1637928143387,
54031.99796233907
],
[
1638524408000,
56556.355173823926
]
],
"market_caps": [
[
1637920962758,
1040185692035.8112
],
[
1637924583096,
1032137732028.0712
],
[
1637928143387,
1020318960913.6139
],
[
1638524408000,
1068341065780.2579
]
],
"total_volumes": [
[
1637920962758,
40002799175.46155
],
[
1637924583096,
38579701553.8867
],
[
1637928143387,
39373185822.85809
],
[
1638524408000,
32567680716.236423
]
]
},
[
{
"crypto": "ethereum"
}
],
{
"prices": [
[
1637920951704,
4096.986983019884
],
[
1637924408082,
4072.6963895955864
],
[
1637928090810,
4021.2930336538925
],
[
1638524390000,
4559.839444343959
]
],
"market_caps": [
[
1637920951704,
485474079335.9266
],
[
1637924408082,
482758573953.61304
],
[
1637928090810,
479260985689.3548
],
[
1638524390000,
540740261905.95264
]
],
"total_volumes": [
[
1637920951704,
25972933719.35031
],
[
1637924408082,
26468521371.13646
],
[
1637928090810,
27042124946.11916
],
[
1638524390000,
20268892519.524815
]
]
}
]
假设 js 是您的 json,这就是我的做法。
l = []
for i in range(0,len(js),2):
prices = [k[1] for k in js[i+1]["prices"]]
market_caps = [k[1] for k in js[i+1]["market_caps"]]
total_volumes = [k[1] for k in js[i+1]["total_volumes"]]
date = [k[0] for k in js[i+1]["total_volumes"]]
crypto = js[i][0]["crypto"]
df = pd.DataFrame({"crypto":crypto,"prices":prices,"market_caps":market_caps,"total_volumes":total_volumes,"date":date})
l.append(df)
df = pd.concat(l)
输出:
crypto prices market_caps total_volumes date
0 bitcoin 55084.244097 1.040186e+12 4.000280e+10 1637920962758
1 bitcoin 54657.982645 1.032138e+12 3.857970e+10 1637924583096
2 bitcoin 54031.997962 1.020319e+12 3.937319e+10 1637928143387
3 bitcoin 56556.355174 1.068341e+12 3.256768e+10 1638524408000
0 ethereum 4096.986983 4.854741e+11 2.597293e+10 1637920951704
1 ethereum 4072.696390 4.827586e+11 2.646852e+10 1637924408082
2 ethereum 4021.293034 4.792610e+11 2.704212e+10 1637928090810
3 ethereum 4559.839444 5.407403e+11 2.026889e+10 1638524390000
这样它的可扩展性更强,你可以像这样过滤你想要的加密货币:
df[df.crypto == "bitcoin"]
输出
crypto prices market_caps total_volumes date
0 bitcoin 55084.244097 1.040186e+12 4.000280e+10 1637920962758
1 bitcoin 54657.982645 1.032138e+12 3.857970e+10 1637924583096
2 bitcoin 54031.997962 1.020319e+12 3.937319e+10 1637928143387
3 bitcoin 56556.355174 1.068341e+12 3.256768e+10 1638524408000