JMESPath expression to flatten array of objects, each with nested arrays of objects

I have JSON containing an array of databases, each of which has an array of users, e.g.

{"databases": [
  {"db": "db_a", "users": [{"name": "alice"}, {"name": "alex"}]},
  {"db": "db_b", "users": [{"name": "bob"}, {"name": "brienne"}]}
]}

I'd like to produce a flat array of database and user pairs, i.e.

[
  {"db": "db_a", "name": "alice"},
  {"db": "db_a", "name": "alex"},
  {"db": "db_b", "name": "bob"},
  {"db": "db_b", "name": "brienne"}
]

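In SQL terms this would be a cartesian join or cartesian product, but I'm not sure of the correct term for a tree structure. The closest I've got so far is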

databases[].users[]

which produces

[{"name": "alice"}, {"name": "alex"}, {"name": "bob"}, {"name": "brienne"}]

databases[].{db: db, name: users[].name}

which produces

[
  {"db": "db_a", "name": ["alice", "alex"]},
  {"db": "db_b", "name": ["bob", "brienne"]}
]

Addendum: I'm happy to accept "You can't do that with JMESPath, here's why ..." as an answer. An HN comment hints at this:

can't reference parents when doing iteration. Why? All options for iteration, [*] and map, all use the iterated item as the context for any expression. There's no opportunity to get any other values in

One option is to loop over subelements

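  tasks:
    # Merge each (database, user) pair into one dict and append it to my_db.
    # default([]) covers the first iteration, when my_db is not yet defined.
    - set_fact:
        my_db: "{{ my_db | default([]) + [ item.0 | combine(item.1) ] }}"
      loop: "{{ lookup('subelements', databases, 'users') }}"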

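You can't do this with JMESPath alone, because a JMESPath expression can only refer to a single scope. When the current scope is a user object, there is no way to reach the outer scope (the database object). JEP 11 would allow access to other scopes, but after several years it still has not been accepted.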
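For comparison, outside JMESPath the outer scope is just the enclosing loop variable. The following is a minimal sketch in plain Python, not part of the original answer, assuming the data shape from the question; `flatten_users` is a hypothetical helper name.

```python
import json

# Data shape taken from the question.
source = json.loads("""
{"databases": [
  {"db": "db_a", "users": [{"name": "alice"}, {"name": "alex"}]},
  {"db": "db_b", "users": [{"name": "bob"}, {"name": "brienne"}]}
]}
""")


def flatten_users(data):
    """Pair every user name with the db of the database that contains it."""
    return [
        {"db": database["db"], "name": user["name"]}
        for database in data["databases"]  # outer scope: the database object
        for user in database["users"]      # inner scope: the user object
    ]


flat = flatten_users(source)
print(flat)
```

The inner loop can still see `database`, which is exactly the parent reference a single-scope JMESPath expression lacks.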

In Ansible it can be done with other filters (h/t Vladimir) and a bit of ugliness

databases_users: "{{ 
    databases | subelements('users')
              | to_json | from_json
              | json_query('[*].{db: [0].db, name: [1].name}')
}}"

Explanation

As a reminder, our starting point is

[ {"db": "db_a", "users": [{"name": "alice"}, {"name": "alex"}]},
  ...]

The subelements filter transforms this into a list of Python tuple pairs

[ ({"db": "db_a", "users": [{"name": "alice"}, {"name": "alex"}]},
   {"name": "alice"}),
  ...]

to_json and from_json convert the tuple pairs to lists (JMESPath for Python ignores tuples)

[ [{"db": "db_a", "users": [{"name": "alice"}, {"name": "alex"}]},
   {"name": "alice"}],
  ...]

json_query selects the desired db and user values

[ {"db": "db_a", "name": "alice"},
  ...]
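The three steps above can be mimicked in plain Python to see what each one contributes. This is a rough emulation under the assumption that subelements yields (parent, child) tuples as described; it is not Ansible's actual implementation, and the variable names are illustrative only.

```python
import json

databases = [
    {"db": "db_a", "users": [{"name": "alice"}, {"name": "alex"}]},
    {"db": "db_b", "users": [{"name": "bob"}, {"name": "brienne"}]},
]

# Step 1: roughly what subelements('users') yields, (parent, child) tuples.
pairs = [(database, user) for database in databases for user in database["users"]]

# Step 2: the JSON round trip (to_json | from_json) turns each tuple into a list.
as_lists = json.loads(json.dumps(pairs))

# Step 3: the equivalent of the query '[*].{db: [0].db, name: [1].name}'.
databases_users = [
    {"db": pair[0]["db"], "name": pair[1]["name"]} for pair in as_lists
]

print(databases_users)
```

The round trip in step 2 only changes the container type; the data inside each pair is untouched.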

Not sure if it's an option, but a custom function can do this:

import json
import jmespath


class CustomFunctions(jmespath.functions.Functions):
    @jmespath.functions.signature({'types': ['object']}, {'types': ['array']})
    def _func_map_merge(self, obj, arg):
        result = []
        for element in arg:
            merged_object = super()._func_merge(obj, element)
            result.append(merged_object)
        return result


options = jmespath.Options(custom_functions=CustomFunctions())


source = """
{"databases": [
  {"db": "db_a", "users": [{"name": "alice"}, {"name": "alex"}]},
  {"db": "db_b", "users": [{"name": "bob"}, {"name": "brienne"}]}
]}

"""

jmespath_expr = """
    databases[].map_merge({"db": db}, @.users[])[]
"""

result = jmespath.search(jmespath_expr, json.loads(source), options=options)
print(result)

which produces

[{'db': 'db_a', 'name': 'alice'},
 {'db': 'db_a', 'name': 'alex'},
 {'db': 'db_b', 'name': 'bob'},
 {'db': 'db_b', 'name': 'brienne'}]