Converting nested JSON to a clean pandas DataFrame
I have a JSON file that I would like to convert into a useful pd.DataFrame so that I can use it for further modelling. The JSON file looks like this:
json_file = {
    "x1": [
        {
            "a": "XZ12ABC1834",
            "b": "J. Doe",
            "c": [
                {
                    "Amount": -50,
                    "Date": "2021-08-15T10:00:00.000Z",
                    "CategoryId": "abc123",
                    "CounterParty": "The Farm",
                    "Description": "some description",
                    "Counter": "XYZ456AZ",
                    "Type": "bc"
                },
                {
                    "Amount": -1,
                    "Date": "2020-08-15T10:00:00.000Z",
                    "CategoryId": "cde123",
                    "CounterParty": "The pool",
                    "Description": "some other description",
                    "Counter": "WYZ12",
                    "Type": "X"
                }
            ]
        },
        {
            "a": "XX34XX872",
            "b": "J. Doe",
            "c": [
                {
                    "Amount": -1.50,
                    "Date": "2019-05-15T10:00:00.000Z",
                    "CategoryId": "QWR627",
                    "CounterParty": "The City",
                    "Description": "last other description",
                    "Counter": "QWE123",
                    "Type": "S"
                }
            ]
        }
    ]
}
I want to convert this JSON file into a DataFrame that looks like this:
var1 | a | b | amount | date | CategoryID | Counterparty | Description | Counter | Type |
---|---|---|---|---|---|---|---|---|---|
x1 | XZ12ABC1834 | J. Doe | -50 | 2021-08-15T10:00:00.000Z | abc123 | The Farm | some description | XYZ456AZ | bc |
x1 | XZ12ABC1834 | J. Doe | -1 | 2020-08-15T10:00:00.000Z | cde123 | The pool | some other description | WYZ12 | X |
x1 | XX34XX872 | J. Doe | -1.50 | 2019-05-15T10:00:00.000Z | QWR627 | The City | last other description | QWE123 | S |
I hope this information is enough to help solve the problem.
I think something like this should work:
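import pandas as pd

result = []
# Each top-level key (e.g. "x1") maps to a list of account records.
for key in json_file:
    # Expand every entry of "c" into its own row and carry "a" and "b" along.
    df_nested_list = pd.json_normalize(
        json_file[key],
        record_path=['c'],
        meta=['a', 'b']
    )
    # Record which top-level key these rows came from.
    df_nested_list['var1'] = key
    result.append(df_nested_list)

# Stack the per-key frames into a single DataFrame.
df = pd.concat(result, ignore_index=True)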
For more details, see: https://towardsdatascience.com/how-to-convert-json-into-a-pandas-dataframe-100b2ae1e0d8
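To check the approach end to end, here is a self-contained sketch that runs the same `pd.json_normalize` loop on the sample data and then moves `var1`, `a` and `b` to the front, as in the desired table. A few things here are my assumptions rather than part of the question: the sample is treated as two account objects (a `}, {` separator added between them), the last amount is read as `-1.50`, and the helper name `flatten_accounts` is only illustrative. The column names keep the casing used in the JSON (`Amount`, `Date`, ...), so rename them afterwards if you need different headers.

```python
import pandas as pd

# Sample data from the question. Assumptions: the two accounts are separate
# objects in the "x1" list, and the last amount is -1.50.
json_file = {
    "x1": [
        {
            "a": "XZ12ABC1834",
            "b": "J. Doe",
            "c": [
                {"Amount": -50, "Date": "2021-08-15T10:00:00.000Z",
                 "CategoryId": "abc123", "CounterParty": "The Farm",
                 "Description": "some description",
                 "Counter": "XYZ456AZ", "Type": "bc"},
                {"Amount": -1, "Date": "2020-08-15T10:00:00.000Z",
                 "CategoryId": "cde123", "CounterParty": "The pool",
                 "Description": "some other description",
                 "Counter": "WYZ12", "Type": "X"},
            ],
        },
        {
            "a": "XX34XX872",
            "b": "J. Doe",
            "c": [
                {"Amount": -1.50, "Date": "2019-05-15T10:00:00.000Z",
                 "CategoryId": "QWR627", "CounterParty": "The City",
                 "Description": "last other description",
                 "Counter": "QWE123", "Type": "S"},
            ],
        },
    ]
}


def flatten_accounts(data):
    """Return one row per entry of "c", tagged with its top-level key.

    Hypothetical helper name, used here only for illustration.
    """
    frames = []
    for key, accounts in data.items():
        # One row per transaction in "c"; "a" and "b" are repeated per row.
        frame = pd.json_normalize(accounts, record_path=["c"], meta=["a", "b"])
        frame["var1"] = key
        frames.append(frame)
    flat = pd.concat(frames, ignore_index=True)

    # Put the identifying columns first, keep the rest in their original order.
    leading = ["var1", "a", "b"]
    rest = [col for col in flat.columns if col not in leading]
    return flat[leading + rest]


df = flatten_accounts(json_file)
# Optional: parse the ISO 8601 strings into timestamps for later modelling.
df["Date"] = pd.to_datetime(df["Date"])
print(df)
```

If the data lives in a file on disk, loading it with `json.load` first and passing the resulting dict to the same function should work the same way.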