Kusto: union of intermediate result produced by expensive calculation

Question

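I have an expensive query that needs a lot of CPU and memory to produce its result. However, the resulting dataset contains only a limited number of rows.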

let result = expensive_function()
    | summarize A=xxx, B=xxx by X, Y, Z;

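I want to append additional rows that summarize this result further. For example, drop the Z column from the summarize keys and set Z="ALL" on the resulting rows.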

result
| union (
    result
    | summarize A=XXX, B=XXX by X, Y
    | extend Z="ALL"
)

When this runs, Kusto appears to expand expensive_function() inside the union operator and execute both copies in parallel, which doubles the CPU and memory consumption.

I tried adding hint.concurrency=1 to the union operator. That brings peak memory back down to the level of the single-result query, but the execution time doubles.

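Is there a hint we can give Kusto to freeze the intermediate result, so that all subsequent queries operate on the frozen result instead of recomputing it from the source?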

Answer 1

Use the materialize() function:

let result = materialize(expensive_function()
    | summarize A=xxx, B=xxx by X, Y, Z);
result
| union (
    result
    | summarize A=XXX, B=XXX by X, Y
    | extend Z="ALL"
)
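
For anyone who wants to try the pattern without the real workload, here is a minimal self-contained sketch. The datatable body of expensive_function(), the Value column, and the sum()/count() aggregations are all assumptions made for illustration, since the question elides the actual function and aggregates. They were picked because sums and counts can be rolled up again from the per-Z rows; an aggregate such as avg() or dcount() could not be re-aggregated this simply.

```kql
// Hypothetical stand-in for expensive_function(); the real one is not shown in the question.
let expensive_function = () {
    datatable(X:string, Y:string, Z:string, Value:long)
    [
        "x1", "y1", "z1", 1,
        "x1", "y1", "z2", 2,
        "x1", "y2", "z1", 3
    ]
};
// materialize() caches the summarized rows for the duration of this query,
// so both legs of the union read the cache instead of re-running the function.
let result = materialize(
    expensive_function()
    | summarize A = sum(Value), B = count() by X, Y, Z
);
result
| union (
    result
    // Roll the per-Z rows up to one row per (X, Y) and label it "ALL".
    | summarize A = sum(A), B = sum(B) by X, Y
    | extend Z = "ALL"
)
```

The cache that materialize() uses is limited in size, so this fits best when the materialized result is small, as it is in the question.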

azure

kql

azure-data-explorer