Converting nested JSON to a clean pandas DataFrame

I have a JSON file that I want to convert into a usable pd.DataFrame so that I can use it for further modelling. The JSON file looks like this:

json_file = {
  "x1": [
    {
      "a": "XZ12ABC1834",
      "b": "J. Doe",
      "c": [
        {
          "Amount": -50,
          "Date": "2021-08-15T10:00:00.000Z",
          "CategoryId": "abc123",
          "CounterParty": "The Farm",
          "Description": "some description",
          "Counter": "XYZ456AZ",
          "Type": "bc"
        },{
          "Amount": -1,
          "Date": "2020-08-15T10:00:00.000Z",
          "CategoryId": "cde123",
          "CounterParty": "The pool",
          "Description": "some other description",
          "Counter": "WYZ12",
          "Type": "X"
        }
      ]
    },
    {
      "a": "XX34XX872",
      "b": "J. Doe",
      "c": [
        {
          "Amount": -1.50,
          "Date": "2019-05-15T10:00:00.000Z",
          "CategoryId": "QWR627",
          "CounterParty": "The City",
          "Description": "last other description",
          "Counter": "QWE123",
          "Type": "S"
        }
      ]
    }
  ]
}

I want to convert this JSON file into a DataFrame that looks like this:

var1  a            b       amount  date                      CategoryID  Counterparty  Description             Counter   Type
x1    XZ12ABC1834  J. Doe  -50     2021-08-15T10:00:00.000Z  abc123      The Farm      some description        XYZ456AZ  bc
x1    XZ12ABC1834  J. Doe  -1      2020-08-15T10:00:00.000Z  cde123      The pool      some other description  WYZ12     X
x1    XX34XX872    J. Doe  -1.50   2019-05-15T10:00:00.000Z  QWR627      The City      last other description  QWE123    S

I hope this is enough information to help me solve this.

I think something like this should work:

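import pandas as pd

result = []

# Normalize the records under each top-level key ("x1", ...) separately.
for key in json_file:
  df_nested_list = pd.json_normalize(
    json_file[key],
    record_path=['c'],  # one row per entry in the nested "c" list
    meta=['a', 'b']     # repeat the parent fields on every row
  )
  df_nested_list['var1'] = key  # record which top-level key the rows came from
  result.append(df_nested_list)

# Combine the per-key frames into one DataFrame with a fresh 0..n-1 index.
df = pd.concat(result, ignore_index=True)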

For more details, see: https://towardsdatascience.com/how-to-convert-json-into-a-pandas-dataframe-100b2ae1e0d8
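
For completeness, here is a self-contained sketch of the same idea that can be run as-is. It is only one possible variant: it assumes the sample is meant to hold two separate records under "x1" and that the last amount is -1.50, and it tags each record with its top-level key first so that a single pd.json_normalize call is enough. The names records and front are made up for this illustration.

```python
import pandas as pd

# Sample data from the question, with the syntax problems repaired
# (assumption: two separate records under "x1", last amount is -1.50).
json_file = {
    "x1": [
        {
            "a": "XZ12ABC1834",
            "b": "J. Doe",
            "c": [
                {"Amount": -50, "Date": "2021-08-15T10:00:00.000Z",
                 "CategoryId": "abc123", "CounterParty": "The Farm",
                 "Description": "some description",
                 "Counter": "XYZ456AZ", "Type": "bc"},
                {"Amount": -1, "Date": "2020-08-15T10:00:00.000Z",
                 "CategoryId": "cde123", "CounterParty": "The pool",
                 "Description": "some other description",
                 "Counter": "WYZ12", "Type": "X"},
            ],
        },
        {
            "a": "XX34XX872",
            "b": "J. Doe",
            "c": [
                {"Amount": -1.50, "Date": "2019-05-15T10:00:00.000Z",
                 "CategoryId": "QWR627", "CounterParty": "The City",
                 "Description": "last other description",
                 "Counter": "QWE123", "Type": "S"},
            ],
        },
    ]
}

# Copy the top-level key into every record as "var1", so the whole
# structure becomes one flat list that json_normalize can handle at once.
records = [
    {"var1": key, **record}
    for key, items in json_file.items()
    for record in items
]

# One row per entry in "c"; var1, a and b are repeated on each row.
df = pd.json_normalize(records, record_path="c", meta=["var1", "a", "b"])

# json_normalize puts the meta columns last; move them to the front
# to match the column order of the desired output.
front = ["var1", "a", "b"]
df = df[front + [col for col in df.columns if col not in front]]

# Optional: turn the ISO 8601 strings into real timestamps.
df["Date"] = pd.to_datetime(df["Date"])

print(df)
```

The column names keep the capitalization of the JSON keys (Amount, Date, CategoryId, ...); df.rename(columns=...) can be used afterwards if other names are preferred.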