Kusto KQL: how to check if JSON array in dataset contains element of another array?
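- The dataset (table) I'm querying has a column that contains a JSON array of strings.
- I have a fixed list of verbs, and I need to check every entry in the table and find those where at least one item in the JSON list starts with one of the verbs from the fixed list.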
// Verbs to look for (actual list is longer).
let verbs = datatable (verb : string) [
"discover",
"gain"
];
// Data. Second column is a JSON string.
let data = datatable(id : int, json: string) [
1, "[\"Discover me\", \"some text\"]",
2, "[\"All good\", \"no invalid verbs\"]",
3, "[\"first element fine\", \"gain power isn't ok\"]",
];
// Query: I need to know if at least one of the items in the "json" column starts
// with one of the verbs of the "verbs" list.
data
| extend parsedJson = parse_json(json)
| extend OneOrMoreListItemsHaveVerb = false
| project id, OneOrMoreListItemsHaveVerb
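I tried using mv-apply but failed, because I'm comparing two lists/arrays against each other rather than one array against a single item.
For the sample data above, I expect the items with IDs 1 and 3 to be returned. The first element of item 1 starts with "discover", and the second element of item 3 starts with "gain".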
You can create an array from the input table (for example, using summarize make_set()) and then use mv-apply to loop over each input.
For example:
let verbs = datatable (verb: string) [
"discover", "gain"
]
;
let verbs_list = toscalar(verbs | summarize make_set(verb))
;
let data = datatable(id: int, json: string) [
1, "[\"Discover me\", \"some text\"]",
2, "[\"All good\", \"no invalid verbs\"]",
3, "[\"first element fine\", \"gain power isn't ok\"]",
]
;
data
| mv-apply verb = verbs_list on (
mv-apply input = parse_json(json) on (
where input startswith verb
)
)
| project ['id'], json
id | json |
---|---|
1 | ["Discover me", "some text"] |
3 | ["first element fine", "gain power isn't ok"] |
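The query skeleton in the question projects a boolean OneOrMoreListItemsHaveVerb per id instead of filtering rows. A minimal sketch of that variant follows, assuming the same sample tables. It is an alternative formulation, not the approach from the answer above: it replaces the nested mv-apply with two mv-expand steps followed by summarize, and the column names input and verb are kept only for continuity.

```kusto
// Same sample tables as in the question.
let verbs = datatable (verb: string) [
    "discover", "gain"
];
let verbs_list = toscalar(verbs | summarize make_set(verb));
let data = datatable(id: int, json: string) [
    1, "[\"Discover me\", \"some text\"]",
    2, "[\"All good\", \"no invalid verbs\"]",
    3, "[\"first element fine\", \"gain power isn't ok\"]"
];
data
// One row per JSON array item.
| mv-expand input = parse_json(json) to typeof(string)
// One row per (item, verb) pair.
| mv-expand verb = verbs_list to typeof(string)
// startswith is case-insensitive, so "Discover me" matches "discover".
| summarize OneOrMoreListItemsHaveVerb = countif(input startswith verb) > 0 by id
| order by id asc
```

With the sample data this should flag ids 1 and 3 as true and id 2 as false. Each id appears exactly once, whereas the filtering query can repeat a row when several item/verb pairs match.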
Alternatively, you can implement similar logic using the partition operator:
let verbs = datatable (verb: string) [
"discover", "gain"
]
;
let data = datatable(id: int, json: string) [
1, "[\"Discover me\", \"some text\"]",
2, "[\"All good\", \"no invalid verbs\"]",
3, "[\"first element fine\", \"gain power isn't ok\"]",
]
;
verbs
| partition by verb
{
data
| mv-apply input = parse_json(json) on(
where input startswith verb
)
| project ['id'], json
}
id | json |
---|---|
1 | ["Discover me", "some text"] |
3 | ["first element fine", "gain power isn't ok"] |