Extract values from a subarray and combine them with the parent value using jq, output as CSV
I have a JSON file with a subarray, and I want to get each subarray element on its own row, together with the ID of its parent.
JSON:
{
"success": true,
"status": {
"http": {
"code": 200,
"message": "OK"
}
},
"result": [{
"id": "123456789",
"start_date": "2021-01-01 08:17:39.989",
"snippets": [{
"transcript": "yes",
"matched_entry": null,
"start": "2021-01-16 11:32:25.922"
}, {
"transcript": null,
"matched_entry": null,
"start": "2021-01-16 11:32:38.179"
}]
}, {
"id": "987654321",
"start_date": "2021-01-01 08:17:39.989",
"duration_total": 301,
"snippets": [{
"transcript": "yes",
"matched_entry": null,
"start": "2021-01-01 08:17:54.055"
}, {
"transcript": "something",
"matched_entry": " meta entry",
"start": "2021-01-01 08:18:11.028"
}, {
"transcript": "no",
"matched_entry": null,
"start": "2021-01-01 08:18:24.057"
}]
}]
}
I am trying to get:
123456789, yes , null, "2021-01-16 11:32:25.922"
123456789, null, null, "2021-01-16 11:32:38.179"
987654321, yes, null, "2021-01-01 08:17:54.055"
987654321, something, "meta entry", "2021-01-01 08:18:11.028"
987654321, no, null, "2021-01-01 08:18:24.057"
My first attempt was:
jq -rc ".result[] | {id: .id, snippetsTranscript: .snippets[].transcript, snippetsMatchedEntry: .snippets[].matched_entry, snippetsStart: .snippets[].start}" 210101_210121_copy.json
But this returned every combination:
{"id":"123456789","snippetsTranscript":"yes","snippetsMatchedEntry":null,"snippetsStart":"2021-01-16 11:32:25.922"}
{"id":"123456789","snippetsTranscript":"yes","snippetsMatchedEntry":null,"snippetsStart":"2021-01-16 11:32:38.179"}
{"id":"123456789","snippetsTranscript":"yes","snippetsMatchedEntry":null,"snippetsStart":"2021-01-16 11:32:25.922"}
{"id":"123456789","snippetsTranscript":"yes","snippetsMatchedEntry":null,"snippetsStart":"2021-01-16 11:32:38.179"}
{"id":"123456789","snippetsTranscript":null,"snippetsMatchedEntry":null,"snippetsStart":"2021-01-16 11:32:25.922"}
{"id":"123456789","snippetsTranscript":null,"snippetsMatchedEntry":null,"snippetsStart":"2021-01-16 11:32:38.179"}
{"id":"123456789","snippetsTranscript":null,"snippetsMatchedEntry":null,"snippetsStart":"2021-01-16 11:32:25.922"}
{"id":"123456789","snippetsTranscript":null,"snippetsMatchedEntry":null,"snippetsStart":"2021-01-16 11:32:38.179"} ...
My second attempt was:
jq -rc ".result[] | {id: .id, snippetsMatchedEntry: [.snippets[].matched_entry], snippetsStart: [.snippets[].start], snippetsTranscript: [.snippets[].transcript]}" 210901_210921_copy.json
But the result was:
{"id":"123456789","snippetsMatchedEntry":[null,null],"snippetsStart":["2021-01-16 11:32:25.922","2021-01-16 11:32:38.179"],"snippetsTranscript":["yes",null]}
{"id":"987654321","snippetsMatchedEntry":[null," meta entry",null],"snippetsStart":["2021-01-01 08:17:54.055","2021-01-01 08:18:11.028","2021-01-01 08:18:24.057"],"snippetsTranscript":["yes","something","no"]}
Is this possible with jq?
For each element of the result array, create an object containing only the id field, and add each element of the snippets subarray to it:
.result[] | {id} + .snippets[]
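As a sanity check, the sketch below counts how many objects each approach produces. It assumes a trimmed copy of the question's data saved under the hypothetical name sample.json, and the short keys t, m and s are placeholders for illustration. The first attempt iterates .snippets[] three times independently, so jq emits every combination; the filter above iterates it only once.

```shell
# Trimmed copy of the question's data (sample.json is a hypothetical file name).
cat > sample.json <<'EOF'
{"result":[
 {"id":"123456789","snippets":[
  {"transcript":"yes","matched_entry":null,"start":"2021-01-16 11:32:25.922"},
  {"transcript":null,"matched_entry":null,"start":"2021-01-16 11:32:38.179"}]},
 {"id":"987654321","snippets":[
  {"transcript":"yes","matched_entry":null,"start":"2021-01-01 08:17:54.055"},
  {"transcript":"something","matched_entry":" meta entry","start":"2021-01-01 08:18:11.028"},
  {"transcript":"no","matched_entry":null,"start":"2021-01-01 08:18:24.057"}]}
]}
EOF

# First attempt: three independent .snippets[] iterations -> 2^3 + 3^3 objects.
combos=$(jq '[.result[] | {id, t: .snippets[].transcript, m: .snippets[].matched_entry, s: .snippets[].start}] | length' sample.json)

# Single iteration: one object per snippet -> 2 + 3 objects.
rows=$(jq '[.result[] | {id} + .snippets[]] | length' sample.json)

echo "combinations: $combos, rows: $rows"
# prints: combinations: 35, rows: 5
```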
If you don't need all the fields of the snippets array, just list the ones you want, as before:
.result[] | {id} + (.snippets[] | {transcript, matched_entry, start})
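Since the question asks for CSV rather than JSON objects, one possible follow-up is to build an array per snippet and pass it through jq's @csv format. This is a minimal sketch, assuming a trimmed copy of the question's data saved under the hypothetical name sample.json. Note that @csv quotes every string and renders null as an empty field, so the rows differ slightly from the layout shown in the question.

```shell
# Trimmed copy of the question's data (sample.json is a hypothetical file name).
cat > sample.json <<'EOF'
{"result":[
 {"id":"123456789","snippets":[
  {"transcript":"yes","matched_entry":null,"start":"2021-01-16 11:32:25.922"},
  {"transcript":null,"matched_entry":null,"start":"2021-01-16 11:32:38.179"}]},
 {"id":"987654321","snippets":[
  {"transcript":"something","matched_entry":" meta entry","start":"2021-01-01 08:18:11.028"}]}
]}
EOF

# Bind the parent id, iterate the subarray once, and emit one CSV row per snippet.
csv=$(jq -r '.result[] | .id as $id | .snippets[] | [$id, .transcript, .matched_entry, .start] | @csv' sample.json)

printf '%s\n' "$csv"
# "123456789","yes",,"2021-01-16 11:32:25.922"
# "123456789",,,"2021-01-16 11:32:38.179"
# "987654321","something"," meta entry","2021-01-01 08:18:11.028"
```

If a literal null is wanted in the output instead of an empty field, replacing a field with something like (.transcript // "null") should work, at the cost of no longer distinguishing a missing value from the string "null".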