JMESPath expression to flatten array of objects, each with nested arrays of objects
I have JSON containing an array of databases, each of which has an array of users, e.g.
{"databases": [
{"db": "db_a", "users": [{"name": "alice"}, {"name": "alex"}]},
{"db": "db_b", "users": [{"name": "bob"}, {"name": "brienne"}]}
]}
I'd like to produce a flat array of databases and users, i.e.
[
{"db": "db_a", "name": "alice"},
{"db": "db_a", "name": "alex"},
{"db": "db_b", "name": "bob"},
{"db": "db_b", "name": "brienne"}
]
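In SQL terms this would be a cartesian join or cartesian product, but I'm not sure of the correct term for a tree structure. The closest I've got so far is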
databases[].users[]
which produces
[{"name": "alice"}, {"name": "alex"}, {"name": "bob"}, {"name": "brienne"}]
and
databases[].{db: db, name: users[].name}
which produces
[
{"db": "db_a", "name": ["alice", "alex"]},
{"db": "db_b", "name": ["bob", "brienne"]}
]
Addendum: I'm happy to accept "You can't do that with JMESPath, here's why ..." as an answer. An HN comment hints at this:
can't reference parents when doing iteration. Why? All options for iteration, [*] and map, all use the iterated item as the context for any expression. There's no opportunity to get any other values in
One option is to loop over subelements:
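tasks:
  - set_fact:
      # default([]) covers the first iteration if my_db is not already defined
      # note: each merged item still carries the database's 'users' key
      my_db: "{{ my_db | default([]) + [ item.0|combine(item.1) ] }}"
    loop: "{{ lookup('subelements',databases,'users') }}"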
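You can't do this with JMESPath alone, because a JMESPath expression can only refer to a single scope. When the current scope is a user object, the outer scope (the database object) is unreachable. JEP 11 would allow access to other scopes, but after several years it still hasn't been accepted.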
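For comparison, here is a minimal sketch in plain Python (not JMESPath) of the same flattening. It is only meant to illustrate the scope point: the inner loop can still see the outer loop variable, which is the parent reference a JMESPath projection does not offer. The variable names are illustrative.

```python
data = {"databases": [
    {"db": "db_a", "users": [{"name": "alice"}, {"name": "alex"}]},
    {"db": "db_b", "users": [{"name": "bob"}, {"name": "brienne"}]},
]}

# While iterating over users, `database` (the outer scope) is still reachable.
flat = [
    {"db": database["db"], "name": user["name"]}
    for database in data["databases"]
    for user in database["users"]
]
print(flat)
```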
On Ansible it can be done with other filters (h/t Vladimir) and some ugliness:
databases_users: "{{
databases | subelements('users')
| to_json | from_json
| json_query('[*].{db: [0].db, name: [1].name}')
}}"
Explanation
As a reminder, our starting point is
[ {"db": "db_a", "users": [{"name": "alice"}, {"name": "alex"}]},
...]
The subelements filter transforms this into a list of Python tuple pairs
[ ({"db": "db_a", "users": [{"name": "alice"}, {"name": "alex"}]},
{"name": "alice"}),
...]
to_json and from_json convert the tuple pairs into lists (Python's JMESPath ignores tuples)
[ [{"db": "db_a", "users": [{"name": "alice"}, {"name": "alex"}]},
{"name": "alice"}],
...]
json_query selects the desired db and user values
[ {"db": "db_a", "name": "alice"},
...]
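The three steps above can be sketched outside Ansible in plain Python. This is only an approximation under stated assumptions: the list comprehension in step 1 is a hand-written stand-in for the subelements filter, and step 3 uses ordinary indexing instead of json_query, so that the sketch needs nothing beyond the standard library.

```python
import json

databases = [
    {"db": "db_a", "users": [{"name": "alice"}, {"name": "alex"}]},
    {"db": "db_b", "users": [{"name": "bob"}, {"name": "brienne"}]},
]

# Step 1 (stand-in for subelements): pair each database with each of its users.
pairs = [(database, user) for database in databases for user in database["users"]]

# Step 2 (to_json | from_json): a JSON round trip turns the tuples into lists.
as_lists = json.loads(json.dumps(pairs))

# Step 3 (stand-in for json_query): take db from element 0 and name from element 1.
databases_users = [{"db": pair[0]["db"], "name": pair[1]["name"]} for pair in as_lists]
print(databases_users)
```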
Not sure whether it's an option, but a custom function can do this:
import json
import jmespath


class CustomFunctions(jmespath.functions.Functions):
    # map_merge(object, array): merge `obj` into every element of `arg`
    @jmespath.functions.signature({'types': ['object']}, {'types': ['array']})
    def _func_map_merge(self, obj, arg):
        result = []
        for element in arg:
            # reuse the built-in merge() so each user object gains the db key
            merged_object = super()._func_merge(obj, element)
            result.append(merged_object)
        return result


options = jmespath.Options(custom_functions=CustomFunctions())

source = """
{"databases": [
{"db": "db_a", "users": [{"name": "alice"}, {"name": "alex"}]},
{"db": "db_b", "users": [{"name": "bob"}, {"name": "brienne"}]}
]}
"""

# map_merge yields one list per database; the trailing [] flattens them
jmespath_expr = """
databases[].map_merge({"db": db}, @.users[])[]
"""

result = jmespath.search(jmespath_expr, json.loads(source), options=options)
print(result)
which produces
[{'db': 'db_a', 'name': 'alice'},
{'db': 'db_a', 'name': 'alex'},
{'db': 'db_b', 'name': 'bob'},
{'db': 'db_b', 'name': 'brienne'}]