How do I normalize deeply nested JSON results that have a lot of different collections?

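I'm getting results back from a GraphQL query to monday.com. The returned data contains many different nested collections, such as a dict of lists of dicts. For example: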

{
  "data": {
    "boards": [
      {
        "name": "board name",
        "groups": [
                    {"title": "group title 1"},
                    {"title": "group title 2"},
                    {"title": "group title 3"},
                    {"title": "group title 4"}
                  ]
       }
     ]
   },
  "account_id": "0000000"
}

I want to make a pandas DataFrame like this:

     group.board    group.title
0    'board_name'   'group title 1'
1    'board_name'   'group title 2'
2    'board_name'   'group title 3'
3    'board_name'   'group title 4'

I tried:

pd.json_normalize(json_data, 'boards')

but I keep getting KeyError: 'boards'.

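Your JSON dict was invalid (it was missing a closing brace, so I fixed it). Here is how I would do it. We go into the dict at json_data["data"]["boards"], because that is where the data lives, use "groups" as the record path, and use the field "name" as one of the meta fields.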

import pandas as pd

json_data = {
    "data": {
        "boards": [
            {
                "name": "board name",
                "groups": [
                    { "title": "group title 1" },
                    { "title": "group title 2" },
                    { "title": "group title 3" },
                    { "title": "group title 4" }
                ]
            },
        ]
    },
    "account_id": "0000000"
}

pd.json_normalize(json_data["data"]["boards"], "groups", ["name"])

Output:


    title           name
0   group title 1   board name
1   group title 2   board name
2   group title 3   board name
3   group title 4   board name
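
As a rough sketch of how the same call behaves when there is more than one board, here is a made-up two-board response (the board and group names are invented for illustration). It also uses json_normalize's record_prefix parameter to label the record columns up front; whether that naming suits your data is an assumption.

```python
import pandas as pd

# Hypothetical response with two boards, invented for illustration only.
response = {
    "data": {
        "boards": [
            {"name": "board A", "groups": [{"title": "g1"}, {"title": "g2"}]},
            {"name": "board B", "groups": [{"title": "g3"}]},
        ]
    },
    "account_id": "0000000",
}

# One row per group; each row carries the name of the board it came from.
df = pd.json_normalize(
    response["data"]["boards"],
    record_path="groups",    # the nested list to expand into rows
    meta=["name"],           # board-level field repeated on every row
    record_prefix="group.",  # prefixes only the record columns
)
print(df)
```

Each group keeps the name of its own board, so the three rows end up tagged board A, board A, board B.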

You get a KeyError because json_data has no key "boards". json_data["data"] does have that key, but passing it still won't give you the expected result.
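
A small sketch to make both points concrete, using a shortened copy of the data from the question (the variable names raised and partial are just for illustration):

```python
import pandas as pd

# Shortened copy of the data from the question.
json_data = {
    "data": {
        "boards": [
            {
                "name": "board name",
                "groups": [{"title": "group title 1"}, {"title": "group title 2"}],
            }
        ]
    },
    "account_id": "0000000",
}

# The top level only holds "data" and "account_id", not "boards".
print(list(json_data))

# So asking for "boards" at the top level fails.
try:
    pd.json_normalize(json_data, "boards")
    raised = False
except KeyError:
    raised = True
print(raised)

# One level down the key exists, but "groups" stays an unexpanded list
# inside a single row, which is not the table the question asks for.
partial = pd.json_normalize(json_data["data"], "boards")
print(partial.columns.tolist())
```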

You need to pass json_data["data"]["boards"], which is a list of dicts, with "groups" as the record_path and ["name"] as the meta path:

>>> df = pd.json_normalize(json_data["data"]["boards"], "groups", ["name"])
>>> df
           title        name
0  group title 1  board name
1  group title 2  board name
2  group title 3  board name
3  group title 4  board name

Then you can rename the columns:

>>> df.columns = ["group.title", "group.board"]
>>> df
     group.title group.board
0  group title 1  board name
1  group title 2  board name
2  group title 3  board name
3  group title 4  board name
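
Assigning to df.columns depends on the column order. As an alternative sketch, an explicit mapping with DataFrame.rename does not, and selecting the columns afterwards puts them in the order shown in the question. The boards variable is just a shorthand introduced here for json_data["data"]["boards"].

```python
import pandas as pd

# Same structure as json_data["data"]["boards"] in the question.
boards = [
    {
        "name": "board name",
        "groups": [
            {"title": "group title 1"},
            {"title": "group title 2"},
            {"title": "group title 3"},
            {"title": "group title 4"},
        ],
    }
]

df = pd.json_normalize(boards, record_path="groups", meta=["name"])

# Rename by explicit mapping so the result does not depend on column order.
df = df.rename(columns={"title": "group.title", "name": "group.board"})

# Reorder to match the layout asked for in the question.
df = df[["group.board", "group.title"]]
print(df)
```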