Azure Kusto Query to produce a pivot with null values using a dynamic array as categories

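I'd like to produce a summary report from some time series data using Azure's Kusto language. The goal is a summary of state counts for 2 different time periods (the last day and the last 3 days), with both using the same categories regardless of whether the period in question contains any instances of a particular state.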

Sample data:

╔════════════╦═══════╗
║    date    ║ state ║
╠════════════╬═══════╣
║ 01/01/2020 ║ On    ║
║ 01/01/2020 ║ Off   ║
║ 01/01/2020 ║ error ║
║ 01/01/2020 ║ Off   ║
║ 01/01/2020 ║ Off   ║
║ 01/01/2020 ║ error ║
║ 01/01/2020 ║ error ║
║ 01/01/2020 ║ On    ║
║ 02/01/2020 ║ Off   ║
║ 02/01/2020 ║ Off   ║
║ 02/01/2020 ║ Off   ║
║ 02/01/2020 ║ Off   ║
║ 02/01/2020 ║ On    ║
║ 02/01/2020 ║ Off   ║
║ 02/01/2020 ║ error ║
║ 02/01/2020 ║ On    ║
║ 02/01/2020 ║ On    ║
║ 02/01/2020 ║ On    ║
║ 02/01/2020 ║ On    ║
║ 02/01/2020 ║ On    ║
║ 02/01/2020 ║ On    ║
║ 02/01/2020 ║ Off   ║
║ 02/01/2020 ║ error ║
║ 02/01/2020 ║ error ║
║ 02/01/2020 ║ error ║
║ 02/01/2020 ║ error ║
║ 02/01/2020 ║ error ║
║ 02/01/2020 ║ error ║
║ 02/01/2020 ║ On    ║
║ 02/01/2020 ║ On    ║
║ 02/01/2020 ║ On    ║
║ 02/01/2020 ║ On    ║
║ 02/01/2020 ║ Off   ║
║ 03/01/2020 ║ Off   ║
║ 03/01/2020 ║ error ║
║ 03/01/2020 ║ On    ║
║ 03/01/2020 ║ Off   ║
║ 03/01/2020 ║ Off   ║
║ 03/01/2020 ║ Off   ║
║ 03/01/2020 ║ Off   ║
║ 03/01/2020 ║ On    ║
║ 03/01/2020 ║ On    ║
║ 03/01/2020 ║ Off   ║
║ 03/01/2020 ║ Off   ║
║ 03/01/2020 ║ error ║
║ 03/01/2020 ║ error ║
║ 03/01/2020 ║ On    ║
║ 03/01/2020 ║ Off   ║
║ 03/01/2020 ║ Off   ║
║ 03/01/2020 ║ Off   ║
║ 03/01/2020 ║ error ║
║ 03/01/2020 ║ On    ║
║ 03/01/2020 ║ Off   ║
║ 03/01/2020 ║ error ║
║ 03/01/2020 ║ On    ║
║ 03/01/2020 ║ Off   ║
║ 03/01/2020 ║ Off   ║
║ 03/01/2020 ║ Off   ║
║ 03/01/2020 ║ Off   ║
║ 03/01/2020 ║ Off   ║
║ 03/01/2020 ║ error ║
║ 03/01/2020 ║ error ║
║ 03/01/2020 ║ Off   ║
║ 03/01/2020 ║ Off   ║
║ 03/01/2020 ║ Off   ║
║ 03/01/2020 ║ Off   ║
║ 03/01/2020 ║ Off   ║
║ 03/01/2020 ║ error ║
║ 03/01/2020 ║ On    ║
║ 03/01/2020 ║ On    ║
║ 03/01/2020 ║ On    ║
║ 03/01/2020 ║ On    ║
║ 04/01/2020 ║ On    ║
║ 04/01/2020 ║ Off   ║
║ 04/01/2020 ║ Off   ║
║ 04/01/2020 ║ Off   ║
║ 04/01/2020 ║ Off   ║
║ 04/01/2020 ║ Off   ║
║ 04/01/2020 ║ Off   ║
║ 04/01/2020 ║ Off   ║
║ 04/01/2020 ║ Off   ║
║ 04/01/2020 ║ On    ║
║ 05/01/2020 ║ On    ║
║ 05/01/2020 ║ On    ║
║ 05/01/2020 ║ On    ║
║ 05/01/2020 ║ On    ║
╚════════════╩═══════╝

To illustrate, creating a pivot in Excel almost does what I need:

╔════════════╦═══════════════╗
║ Row Labels ║ Count of date ║
╠════════════╬═══════════════╣
║ 01/01/2020 ║             8 ║
║  error     ║             3 ║
║  Off       ║             3 ║
║  On        ║             2 ║
║ 02/01/2020 ║            25 ║
║  error     ║             7 ║
║  Off       ║             7 ║
║  On        ║            11 ║
║ 03/01/2020 ║            39 ║
║  error     ║             8 ║
║  Off       ║            21 ║
║  On        ║            10 ║
║ 04/01/2020 ║            10 ║
║  Off       ║             8 ║
║  On        ║             2 ║
║ 05/01/2020 ║             4 ║
║  On        ║             4 ║
╚════════════╩═══════════════╝

What I need the Kusto query to do is produce a table like this:

╔════════════╦═══════════════╗
║ Row Labels ║ Count of date ║
╠════════════╬═══════════════╣
║ 01/01/2020 ║             8 ║
║  error     ║             3 ║
║  Off       ║             3 ║
║  On        ║             2 ║
║ 02/01/2020 ║            25 ║
║  error     ║             7 ║
║  Off       ║             7 ║
║  On        ║            11 ║
║ 03/01/2020 ║            39 ║
║  error     ║             8 ║
║  Off       ║            21 ║
║  On        ║            10 ║
║ 04/01/2020 ║            10 ║
║  **error** ║             0 ║
║  Off       ║             8 ║
║  On        ║             2 ║
║ 05/01/2020 ║             4 ║
║  **error** ║             0 ║
║  **Off**   ║             0 ║
║  On        ║             4 ║
╚════════════╩═══════════════╝

Note 04/01/2020 and 05/01/2020, which have 0 values for the categories that do not appear on those dates.

I've tried using summarize, but I can't figure out how to use a preset list of categories and default to 0 where needed.

data
| summarize count(state) by bin(date, 1d), state

Any tips on how to achieve this would be much appreciated.

If you can "settle" for a different output schema (which, one might argue, is more convenient to work with), you could try the following:

Output:

| dt                          | sum_count_ | On | Off | error |
|-----------------------------|------------|----|-----|-------|
| 2020-01-01 00:00:00.0000000 | 8          | 2  | 3   | 3     |
| 2020-02-01 00:00:00.0000000 | 25         | 11 | 7   | 7     |
| 2020-03-01 00:00:00.0000000 | 39         | 10 | 21  | 8     |
| 2020-04-01 00:00:00.0000000 | 10         | 2  | 8   |       |
| 2020-05-01 00:00:00.0000000 | 4          | 4  |     |       |

Query:

datatable(dt:datetime, state:string)
[
    datetime(01/01/2020), 'On',
    datetime(01/01/2020), 'Off',
    datetime(01/01/2020), 'error',
    datetime(01/01/2020), 'Off',
    datetime(01/01/2020), 'Off',
    datetime(01/01/2020), 'error',
    datetime(01/01/2020), 'error',
    datetime(01/01/2020), 'On',
    datetime(02/01/2020), 'Off',
    datetime(02/01/2020), 'Off',
    datetime(02/01/2020), 'Off',
    datetime(02/01/2020), 'Off',
    datetime(02/01/2020), 'On',
    datetime(02/01/2020), 'Off',
    datetime(02/01/2020), 'error',
    datetime(02/01/2020), 'On',
    datetime(02/01/2020), 'On',
    datetime(02/01/2020), 'On',
    datetime(02/01/2020), 'On',
    datetime(02/01/2020), 'On',
    datetime(02/01/2020), 'On',
    datetime(02/01/2020), 'Off',
    datetime(02/01/2020), 'error',
    datetime(02/01/2020), 'error',
    datetime(02/01/2020), 'error',
    datetime(02/01/2020), 'error',
    datetime(02/01/2020), 'error',
    datetime(02/01/2020), 'error',
    datetime(02/01/2020), 'On',
    datetime(02/01/2020), 'On',
    datetime(02/01/2020), 'On',
    datetime(02/01/2020), 'On',
    datetime(02/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'error',
    datetime(03/01/2020), 'On',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'On',
    datetime(03/01/2020), 'On',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'error',
    datetime(03/01/2020), 'error',
    datetime(03/01/2020), 'On',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'error',
    datetime(03/01/2020), 'On',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'error',
    datetime(03/01/2020), 'On',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'error',
    datetime(03/01/2020), 'error',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'error',
    datetime(03/01/2020), 'On',
    datetime(03/01/2020), 'On',
    datetime(03/01/2020), 'On',
    datetime(03/01/2020), 'On',
    datetime(04/01/2020), 'On',
    datetime(04/01/2020), 'Off',
    datetime(04/01/2020), 'Off',
    datetime(04/01/2020), 'Off',
    datetime(04/01/2020), 'Off',
    datetime(04/01/2020), 'Off',
    datetime(04/01/2020), 'Off',
    datetime(04/01/2020), 'Off',
    datetime(04/01/2020), 'Off',
    datetime(04/01/2020), 'On',
    datetime(05/01/2020), 'On',
    datetime(05/01/2020), 'On',
    datetime(05/01/2020), 'On',
    datetime(05/01/2020), 'On',
]
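// Count rows per day and state
| summarize count() by dt, state
// Per day: the total count, plus a property bag mapping each state to its count
| summarize sum(count_), b = make_bag(pack(state, count_)) by dt
// Expand the bag into one column per state (states absent on a day stay empty)
| evaluate bag_unpack(b)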
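
The output above leaves empty cells where a state did not occur on a given day. If zeros are preferred, one option is to skip `bag_unpack` and read each category from the bag explicitly, falling back to 0 with `coalesce()`. The following is a minimal sketch rather than part of the original answer: it assumes the category names are known in advance, and it uses a small hypothetical sample in place of the full data set.

```kusto
// Hypothetical sample data; substitute the real table here.
datatable(dt:datetime, state:string)
[
    datetime(2020-04-01), 'On',
    datetime(2020-04-01), 'Off',
    datetime(2020-04-01), 'Off',
    datetime(2020-05-01), 'On',
]
| summarize count() by dt, state
| summarize sum(count_), b = make_bag(pack(state, count_)) by dt
// A key missing from the bag yields null; coalesce() replaces it with 0.
| project dt, sum_count_,
          On    = coalesce(tolong(b["On"]), 0),
          Off   = coalesce(tolong(b["Off"]), 0),
          error = coalesce(tolong(b["error"]), 0)
```

Because each column is named explicitly, a category that never appears in the data (here `error`) should still get a column, which `bag_unpack` alone would not create.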

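If you can't settle for that, but can make assumptions about the content of the state column (e.g. that its values are On, Off, and error), then you could try this: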

Output:

| RowLabel         | Count |
|------------------|-------|
| 01/01/2020 Total | 8     |
| 01/01/2020 Error | 3     |
| 01/01/2020 On    | 2     |
| 01/01/2020 Off   | 3     |
| 02/01/2020 Total | 25    |
| 02/01/2020 Error | 7     |
| 02/01/2020 On    | 11    |
| 02/01/2020 Off   | 7     |
| 03/01/2020 Total | 39    |
| 03/01/2020 Error | 8     |
| 03/01/2020 On    | 10    |
| 03/01/2020 Off   | 21    |
| 04/01/2020 Total | 10    |
| 04/01/2020 Error | 0     |
| 04/01/2020 On    | 2     |
| 04/01/2020 Off   | 8     |
| 05/01/2020 Total | 4     |
| 05/01/2020 Error | 0     |
| 05/01/2020 On    | 4     |
| 05/01/2020 Off   | 0     |

Query:

datatable(dt:datetime, state:string)
[
    datetime(01/01/2020), 'On',
    datetime(01/01/2020), 'Off',
    datetime(01/01/2020), 'error',
    datetime(01/01/2020), 'Off',
    datetime(01/01/2020), 'Off',
    datetime(01/01/2020), 'error',
    datetime(01/01/2020), 'error',
    datetime(01/01/2020), 'On',
    datetime(02/01/2020), 'Off',
    datetime(02/01/2020), 'Off',
    datetime(02/01/2020), 'Off',
    datetime(02/01/2020), 'Off',
    datetime(02/01/2020), 'On',
    datetime(02/01/2020), 'Off',
    datetime(02/01/2020), 'error',
    datetime(02/01/2020), 'On',
    datetime(02/01/2020), 'On',
    datetime(02/01/2020), 'On',
    datetime(02/01/2020), 'On',
    datetime(02/01/2020), 'On',
    datetime(02/01/2020), 'On',
    datetime(02/01/2020), 'Off',
    datetime(02/01/2020), 'error',
    datetime(02/01/2020), 'error',
    datetime(02/01/2020), 'error',
    datetime(02/01/2020), 'error',
    datetime(02/01/2020), 'error',
    datetime(02/01/2020), 'error',
    datetime(02/01/2020), 'On',
    datetime(02/01/2020), 'On',
    datetime(02/01/2020), 'On',
    datetime(02/01/2020), 'On',
    datetime(02/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'error',
    datetime(03/01/2020), 'On',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'On',
    datetime(03/01/2020), 'On',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'error',
    datetime(03/01/2020), 'error',
    datetime(03/01/2020), 'On',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'error',
    datetime(03/01/2020), 'On',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'error',
    datetime(03/01/2020), 'On',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'error',
    datetime(03/01/2020), 'error',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'Off',
    datetime(03/01/2020), 'error',
    datetime(03/01/2020), 'On',
    datetime(03/01/2020), 'On',
    datetime(03/01/2020), 'On',
    datetime(03/01/2020), 'On',
    datetime(04/01/2020), 'On',
    datetime(04/01/2020), 'Off',
    datetime(04/01/2020), 'Off',
    datetime(04/01/2020), 'Off',
    datetime(04/01/2020), 'Off',
    datetime(04/01/2020), 'Off',
    datetime(04/01/2020), 'Off',
    datetime(04/01/2020), 'Off',
    datetime(04/01/2020), 'Off',
    datetime(04/01/2020), 'On',
    datetime(05/01/2020), 'On',
    datetime(05/01/2020), 'On',
    datetime(05/01/2020), 'On',
    datetime(05/01/2020), 'On',
]
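// Count each known state per day; countif() returns 0 when a state is absent
| summarize Total = count(),
            Error = countif(state == "error"),
            On = countif(state == "On"),
            Off = countif(state == "Off")
         by dt
// Pack the counts into a property bag so they can be expanded into rows
| project dt, p = pack("Total", Total, "Error", Error, "On", On, "Off", Off)
// mv-apply yields one single-key bag per entry; turn each into a labelled row
| mv-apply p on (
    extend key = tostring(bag_keys(p)[0])
    | project RowLabel = strcat(format_datetime(dt, "MM/dd/yyyy"), " ", key),
              Count = p[key]
)
| project-away dt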
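
The query above hard-codes one `countif()` per state. Since the question asks about driving the categories from a dynamic array, here is a hedged sketch of another common pattern: build every date/category combination with `mv-expand`, left-join the actual counts, and default missing counts to 0. The names `categories` and `data`, and the small sample table, are assumptions made for illustration and are not taken from the original answer.

```kusto
// Preset list of categories (assumed to be known up front).
let categories = dynamic(["On", "Off", "error"]);
// Hypothetical sample data; substitute the real table here.
let data = datatable(dt:datetime, state:string)
[
    datetime(2020-04-01), 'On',
    datetime(2020-04-01), 'Off',
    datetime(2020-04-01), 'Off',
    datetime(2020-05-01), 'On',
];
data
| distinct dt
// Attach the full category list to every date, then expand to one row per category.
| extend state = categories
| mv-expand state to typeof(string)
// Bring in the actual counts; combinations with no rows get a null Count.
| join kind=leftouter (
    data
    | summarize Count = count() by dt, state
) on dt, state
// Replace the nulls left by the outer join with 0.
| project dt, state, Count = coalesce(Count, 0)
| order by dt asc, state asc
```

This should return one row per date and category, with a Count of 0 for combinations that do not occur in the data. Changing the reported categories then only requires editing the `categories` array.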