Converting nested JSON to a clean pd DataFrame

Question

I have a JSON file that I want to convert into a useful pd.DataFrame so that I can use it for further modelling. The JSON file looks like this:

json_file = {
  "x1": [
    {
      "a": "XZ12ABC1834",
      "b": "J. Doe",
      "c": [
        {
          "Amount": -50,
          "Date": "2021-08-15T10:00:00.000Z",
          "CategoryId": "abc123",
          "CounterParty": "The Farm",
          "Description": "some description",
          "Counter": "XYZ456AZ",
          "Type": "bc"
        },{
          "Amount": -1,
          "Date": "2020-08-15T10:00:00.000Z",
          "CategoryId": "cde123",
          "CounterParty": "The pool",
          "Description": "some other description",
          "Counter": "WYZ12",
          "Type": "X"
        }
      ]
    },
    {
      "a": "XX34XX872",
      "b": "J. Doe",
      "c": [
        {
          "Amount": -1.50,
          "Date": "2019-05-15T10:00:00.000Z",
          "CategoryId": "QWR627",
          "CounterParty": "The City",
          "Description": "last other description",
          "Counter": "QWE123",
          "Type": "S"
        }
      ]
    }
  ]
}

I want to convert this JSON file into a DataFrame that looks like this:

var1	a	b	amount	date	CategoryID	Counterparty	Description	Counter	Type
x1	XZ12ABC1834	J. Doe	-50	2021-08-15T10:00:00.000Z	abc123	The Farm	some description	XYZ456AZ	bc
x1	XZ12ABC1834	J. Doe	-1	2020-08-15T10:00:00.000Z	cde123	The pool	some other description	WYZ12	X
x1	XX34XX872	J. Doe	-1.50	2019-05-15T10:00:00.000Z	QWR627	The City	last other description	QWE123	S

Hopefully this is enough information to help me with this problem.

Answer 1

I think something like this should work:

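import pandas as pd

result = []

# Each top-level key (e.g. "x1") maps to a list of records.
for key in json_file:
    # Flatten the nested list under "c" into one row per entry,
    # carrying the parent fields "a" and "b" along as extra columns.
    df_nested_list = pd.json_normalize(
        json_file[key],
        record_path=['c'],
        meta=['a', 'b'],
    )
    # Record which top-level key these rows came from.
    df_nested_list['var1'] = key
    result.append(df_nested_list)

# Stack the per-key frames into one DataFrame with a fresh 0..n-1 index.
df = pd.concat(result, ignore_index=True)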

For more details, see: https://towardsdatascience.com/how-to-convert-json-into-a-pandas-dataframe-100b2ae1e0d8
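
To get the exact layout asked for in the question, the columns still need to be reordered and a few of them renamed. Below is a minimal self-contained sketch of that step. It assumes the sample data is held as a Python dict with the syntax errors repaired (a separate object for the second account and -1.50 written with a decimal point), and the names `frames`, `column_order` and `renames` are only illustrative.

```python
import pandas as pd

# Sample data from the question, assumed to be repaired into valid syntax.
json_file = {
    "x1": [
        {
            "a": "XZ12ABC1834",
            "b": "J. Doe",
            "c": [
                {"Amount": -50, "Date": "2021-08-15T10:00:00.000Z",
                 "CategoryId": "abc123", "CounterParty": "The Farm",
                 "Description": "some description",
                 "Counter": "XYZ456AZ", "Type": "bc"},
                {"Amount": -1, "Date": "2020-08-15T10:00:00.000Z",
                 "CategoryId": "cde123", "CounterParty": "The pool",
                 "Description": "some other description",
                 "Counter": "WYZ12", "Type": "X"},
            ],
        },
        {
            "a": "XX34XX872",
            "b": "J. Doe",
            "c": [
                {"Amount": -1.50, "Date": "2019-05-15T10:00:00.000Z",
                 "CategoryId": "QWR627", "CounterParty": "The City",
                 "Description": "last other description",
                 "Counter": "QWE123", "Type": "S"},
            ],
        },
    ]
}

frames = []
for key, records in json_file.items():
    # One row per entry in "c", with "a" and "b" repeated on every row.
    frame = pd.json_normalize(records, record_path=["c"], meta=["a", "b"])
    # Put the top-level key in the first column.
    frame.insert(0, "var1", key)
    frames.append(frame)

df = pd.concat(frames, ignore_index=True)

# json_normalize puts the meta columns last, so reorder explicitly,
# then rename to the header spelling used in the desired table.
column_order = ["var1", "a", "b", "Amount", "Date", "CategoryId",
                "CounterParty", "Description", "Counter", "Type"]
renames = {"Amount": "amount", "Date": "date",
           "CategoryId": "CategoryID", "CounterParty": "Counterparty"}
df = df[column_order].rename(columns=renames)

print(df.to_string(index=False))
```

This should give three rows and ten columns in the order shown in the question. If the data comes from a file on disk rather than a dict, it can be read first with `json.load` from the standard library and then passed through the same loop.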

python

json

nested

list

dataframe