使用 JQ - 加入 2 json 个结构不同但具有公共密钥的文件

Using JQ - Join 2 json files which do not have the same structure but with a common key

我有 2 个文件要根据公用密钥加入。

第一个文件是一个数组:

{"_time": "2022-02-20T23","csp_name": "1","tool_bf_id": "1234", "dvc_ssr": "aa-1111"}
{"_time": "2022-02-20T23","csp_name": "2","tool_bf_id": "4567", "dvc_ssr": "aa-2222"}
{"_time": "2022-02-20T23","csp_name": "3","tool_bf_id": "1357", "dvc_ssr": "null"}
{"_time": "2022-02-20T23","csp_name": "4","tool_bf_id": "2468", "dvc_ssr": "aa-1111"}
{"_time": "2022-02-20T23","csp_name": "5","tool_bf_id": "1246", "dvc_ssr": "null"}

第二个文件也是一个数组,完整列出所有不同的“dvc_ssr”与:

{"host": "hostId1","dvc_ssr": "aa-1111"}
{"host": "hostId2","dvc_ssr": "aa-2222"}

我正在尝试通过使用 dvc_ssr 键值将第二个文件中的信息添加到第一个文件中来加入这两个文件。

我期待这样的事情:

{"_time": "2022-02-20T23","csp_name": "1","tool_bf_id": "1234", "dvc_ssr": "aa-1111","host": "hostId1"}
{"_time": "2022-02-20T23","csp_name": "2",,"tool_bf_id": "4567", "dvc_ssr": "aa-2222","host": "hostId2"}
{"_time": "2022-02-20T23","csp_name": "3","tool_bf_id": "1357", "dvc_ssr": "null"}
{"_time": "2022-02-20T23","csp_name": "4","tool_bf_id": "2468", "dvc_ssr": "aa-1111","host": "hostId1"}
{"_time": "2022-02-20T23","csp_name": "5","tool_bf_id": "1246", "dvc_ssr": "null"}

经过一些研究,我找到了使用 flatten/group_by/add ..

的想法

flatten | group_by(.dvc_ssr) | map(reduce .[] as $x ({}; . * $x))

[.[1] + . [0] | group_by(.dvc_ssr) [] | add]

但这行不通。 问题是通过对它们进行分组,我将“null”和具有相同“dvc_ssr”的那些分组。最后我丢失了一些记录。

我能够使用 jq -s (--slurp) 加入文件并尝试对数组进行分组

jq -s '[.[0] + .[1] | group_by(.dvc_ssr) []]' file1.json file2.json

然后我可以从第二个文件中删除未使用的记录,方法是: |map(select(if ._time!=null then . else empty end))| .[]

真正的想法是像 SQL 那样进行 JOIN,其中“dvc_ssr”是相同的。

使用 JOIN 的所有参数,你会做

jq --slurpfile fst first.json --slurpfile snd second.json -nc '
  JOIN(INDEX($snd[]; .dvc_ssr); $fst[]; .dvc_ssr; add)
'
{"_time":"2022-02-20T23","csp_name":"1","tool_bf_id":"1234","dvc_ssr":"aa-1111","host":"hostId1"}
{"_time":"2022-02-20T23","csp_name":"2","tool_bf_id":"4567","dvc_ssr":"aa-2222","host":"hostId2"}
{"_time":"2022-02-20T23","csp_name":"3","tool_bf_id":"1357","dvc_ssr":"null"}
{"_time":"2022-02-20T23","csp_name":"4","tool_bf_id":"2468","dvc_ssr":"aa-1111","host":"hostId1"}
{"_time":"2022-02-20T23","csp_name":"5","tool_bf_id":"1246","dvc_ssr":"null"}

根据您的应用程序上下文,您可能会切断其中的一些。例如,只有两个参数的 JOIN 接受流作为输入,并允许您在 JOIN.

之外处理连接

相同示例(注意 -s 而不是 -n,以及 JOIN(…)[] 而不是 JOIN(…)

jq --slurpfile snd second.json -sc '
  JOIN(INDEX($snd[]; .dvc_ssr); .dvc_ssr)[] | add
' first.json