Kusto KQL: how to check if JSON array in dataset contains element of another array?

// Verbs to look for (actual list is longer).
let verbs = datatable (verb : string) [
"discover",
"gain"
];

// Data. Second column is a JSON string.
let data = datatable(id : int, json: string) [
1, "[\"Discover me\", \"some text\"]",
2, "[\"All good\", \"no invalid verbs\"]",
3, "[\"first element fine\", \"gain power isn't ok\"]",
];

// Query: I need to know if at least one of the items in the "json" column starts
// with one of the verbs of the "verbs" list.
data
| extend  parsedJson = parse_json(json)
| extend OneOrMoreListItemsHaveVerb = false
| project id, OneOrMoreListItemsHaveVerb

I tried using mv-apply but failed, because I am comparing two lists/arrays against each other rather than one array against a single item.

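For the sample data above, I expect the rows with IDs 1 and 3 to be returned. The first element of row 1 starts with "discover", and the second element of row 3 starts with "gain".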

You can build an array from the input table (for example with summarize make_set()), and then use mv-apply to loop over each input.

For example:

let verbs = datatable (verb: string) [
    "discover", "gain"
]
;
let verbs_list = toscalar(verbs | summarize make_set(verb))
;
let data = datatable(id: int, json: string) [
    1, "[\"Discover me\", \"some text\"]",
    2, "[\"All good\", \"no invalid verbs\"]",
    3, "[\"first element fine\", \"gain power isn't ok\"]",
]
;
data
| mv-apply verb = verbs_list on (
    mv-apply input = parse_json(json) on (
        where input startswith verb
    )
)
| project ['id'], json

id  json
1   ["Discover me", "some text"]
3   ["first element fine", "gain power isn't ok"]

Alternatively, you can implement similar logic using the partition operator:

let verbs = datatable (verb: string) [
    "discover", "gain"
]
;
let data = datatable(id: int, json: string) [
    1, "[\"Discover me\", \"some text\"]",
    2, "[\"All good\", \"no invalid verbs\"]",
    3, "[\"first element fine\", \"gain power isn't ok\"]",
]
;
verbs
| partition by verb
{
    data
    | mv-apply input = parse_json(json) on(
        where input startswith verb
    )
    | project ['id'], json
}

id  json
1   ["Discover me", "some text"]
3   ["first element fine", "gain power isn't ok"]
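
Both queries above filter the rows. The question's stub instead projects a boolean OneOrMoreListItemsHaveVerb column for every id, so here is a minimal sketch of one way that might be produced. It swaps the nested mv-apply for mv-expand plus a cross join on a constant key, which is a different technique from the ones shown above. The names item, dummy and hits are illustrative and not from the original post, and rows with an empty or null array have not been considered.

```kusto
// Sample inputs copied from the question.
let verbs = datatable (verb: string) [
    "discover", "gain"
];
let data = datatable(id: int, json: string) [
    1, "[\"Discover me\", \"some text\"]",
    2, "[\"All good\", \"no invalid verbs\"]",
    3, "[\"first element fine\", \"gain power isn't ok\"]"
];
data
// One row per array element; "item" is an illustrative column name.
| mv-expand item = parse_json(json) to typeof(string)
// A constant key ("dummy", hypothetical) pairs every element with every verb.
| extend dummy = 1
| join kind=inner (verbs | extend dummy = 1) on dummy
// Count the element/verb pairs where the element starts with the verb.
// startswith is case-insensitive, so "Discover me" matches "discover".
| summarize hits = countif(item startswith verb) by id
| project id, OneOrMoreListItemsHaveVerb = hits > 0
| order by id asc
```

For the sample data this should return true for ids 1 and 3 and false for id 2. Because the join pairs every array element with every verb, the intermediate row count grows with the product of the two sizes, so for a long verb list the mv-apply variant above may be the better fit.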