Kusto query - Return top 5 per day by category
I'm trying to count the occurrences of each `Name`, grouped by `Headsection` and by day.
Suppose I have the following table structure (a small excerpt):
Timestamp | Headsection | Name |
---|---|---|
01/01/2021 | 1 | A |
01/01/2021 | 2 | AA |
01/01/2021 | 3 | AAA |
01/01/2021 | 1 | B |
01/01/2021 | 2 | BB |
01/01/2021 | 3 | BBB |
01/01/2021 | 1 | C |
01/01/2021 | 2 | CC |
01/01/2021 | 3 | CCC |
01/01/2021 | 1 | D |
01/01/2021 | 2 | CC |
01/01/2021 | 3 | DDD |
01/01/2021 | 1 | E |
01/01/2021 | 2 | DD |
01/01/2021 | 3 | EEE |
01/01/2021 | 1 | A |
01/01/2021 | 2 | EE |
01/01/2021 | 3 | DDD |
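Essentially, I want the top 5 names per day for each headsection, ranked by count.
There are 3 headsections, so every day of the year should produce 15 rows.
To visualize, I want to summarize the table like this: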
Timestamp | Headsection | Name | Name_count |
---|---|---|---|
01/01/2021 | 1 | A | 2 |
01/01/2021 | 1 | B | 1 |
01/01/2021 | 1 | C | 1 |
01/01/2021 | 1 | D | 1 |
01/01/2021 | 1 | E | 1 |
01/01/2021 | 2 | CC | 2 |
01/01/2021 | 2 | AA | 1 |
01/01/2021 | 2 | BB | 1 |
01/01/2021 | 2 | DD | 1 |
01/01/2021 | 2 | EE | 1 |
01/01/2021 | 3 | DDD | 2 |
01/01/2021 | 3 | AAA | 1 |
01/01/2021 | 3 | BBB | 1 |
01/01/2021 | 3 | CCC | 1 |
01/01/2021 | 3 | EEE | 1 |
I have restricted the query with:
|where timestamp between (startofday(datetime(2021-01-01)) .. endofday(now()))
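This means the query should turn the input table into the output table for every day up to now.
For example, the next 15 rows should be for 01/02/2021 (January 2nd): the top 5 names for that day, per headsection.
I'm pretty new to KQL, so I could really use some help!
I've tried the top-nested and summarize operators, but I can't get them to work.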
This should do the trick:
let NumItemsByDayAndHeadsection = 5;
datatable(Timestamp:datetime, Headsection:long, Name:string) [
datetime(2021-01-01), 1, "A",
datetime(2021-01-01), 2, "AA",
datetime(2021-01-01), 3, "AAA",
datetime(2021-01-01), 1, "B",
datetime(2021-01-01), 2, "BB",
datetime(2021-01-01), 3, "BBB",
datetime(2021-01-01), 1, "C",
datetime(2021-01-01), 2, "CC",
datetime(2021-01-01), 3, "CCC",
datetime(2021-01-01), 1, "D",
datetime(2021-01-01), 2, "DD",
datetime(2021-01-01), 3, "DDD",
datetime(2021-01-01), 1, "E",
datetime(2021-01-01), 2, "EE",
datetime(2021-01-01), 3, "EEE",
datetime(2021-01-01), 1, "A",
datetime(2021-01-01), 2, "EE",
datetime(2021-01-01), 3, "DDD"
]
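// Count how often each Name occurs per Timestamp and Headsection.
| summarize NameCount = count() by Timestamp, Headsection, Name
// Sort so that make_list() below collects the most frequent names first.
| order by Headsection asc, NameCount desc
// Keep at most NumItemsByDayAndHeadsection entries per Timestamp and Headsection.
| summarize make_list(Timestamp, NumItemsByDayAndHeadsection), make_list(Name, NumItemsByDayAndHeadsection), make_list(NameCount, NumItemsByDayAndHeadsection) by Timestamp, Headsection
// Expand the parallel lists back into one row per name.
| mv-expand list_Timestamp, list_Name, list_NameCount
| project Timestamp, Headsection, Name = list_Name, NameCount = list_NameCount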
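Output (the sample datatable above lists EE twice for Headsection 2, whereas the question's data lists CC twice, so EE is the name with a count of 2 here):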
Timestamp | Headsection | Name | NameCount |
---|---|---|---|
2021-01-01 00:00:00.0000000 | 1 | A | 2 |
2021-01-01 00:00:00.0000000 | 1 | B | 1 |
2021-01-01 00:00:00.0000000 | 1 | C | 1 |
2021-01-01 00:00:00.0000000 | 1 | D | 1 |
2021-01-01 00:00:00.0000000 | 1 | E | 1 |
2021-01-01 00:00:00.0000000 | 2 | EE | 2 |
2021-01-01 00:00:00.0000000 | 2 | AA | 1 |
2021-01-01 00:00:00.0000000 | 2 | BB | 1 |
2021-01-01 00:00:00.0000000 | 2 | CC | 1 |
2021-01-01 00:00:00.0000000 | 2 | DD | 1 |
2021-01-01 00:00:00.0000000 | 3 | DDD | 2 |
2021-01-01 00:00:00.0000000 | 3 | AAA | 1 |
2021-01-01 00:00:00.0000000 | 3 | BBB | 1 |
2021-01-01 00:00:00.0000000 | 3 | CCC | 1 |
2021-01-01 00:00:00.0000000 | 3 | EEE | 1 |
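
The sample data only contains midnight timestamps, so grouping by `Timestamp` happens to group by day. If the real column carries a time of day, the grouping key probably needs to be truncated first. Below is a minimal sketch of the same make_list approach under that assumption: the sample rows and the `Day` column name are made up for illustration, and the `where` clause is taken from the question (written with the `Timestamp` casing used in the datatable).

```kusto
// Hypothetical sample: timestamps carry a time of day and span two days.
let NumItemsByDayAndHeadsection = 5;
datatable(Timestamp:datetime, Headsection:long, Name:string) [
    datetime(2021-01-01 08:15:00), 1, "A",
    datetime(2021-01-01 17:40:00), 1, "A",
    datetime(2021-01-01 09:00:00), 1, "B",
    datetime(2021-01-02 10:30:00), 1, "B",
    datetime(2021-01-02 23:59:00), 1, "B",
    datetime(2021-01-02 11:00:00), 2, "CC"
]
| where Timestamp between (startofday(datetime(2021-01-01)) .. endofday(now()))
// bin() truncates each timestamp to midnight, so rows from the same day share one key.
| summarize NameCount = count() by Day = bin(Timestamp, 1d), Headsection, Name
// Sort before make_list() so the most frequent names are collected first;
// Name asc is only a tie-breaker to make the result deterministic.
| order by Day asc, Headsection asc, NameCount desc, Name asc
| summarize Name = make_list(Name, NumItemsByDayAndHeadsection),
            NameCount = make_list(NameCount, NumItemsByDayAndHeadsection)
        by Day, Headsection
// Expand both lists in parallel and cast the values back from dynamic.
| mv-expand Name to typeof(string), NameCount to typeof(long)
| order by Day asc, Headsection asc, NameCount desc
```

With these six rows, January 1st should yield A (2) and B (1) for Headsection 1, and January 2nd should yield B (2) for Headsection 1 and CC (1) for Headsection 2. The list of timestamps from the original query is left out here because `Day` is already a grouping key.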
Another approach is to use the top-nested operator, which was created specifically for this purpose.
let NumItemsByDayAndHeadsection = 5;
datatable(Timestamp: datetime, Headsection: long, Name: string) [
datetime(2021-01-01), 1, "A",
datetime(2021-01-01), 2, "AA",
datetime(2021-01-01), 3, "AAA",
datetime(2021-01-01), 1, "B",
datetime(2021-01-01), 2, "BB",
datetime(2021-01-01), 3, "BBB",
datetime(2021-01-01), 1, "C",
datetime(2021-01-01), 2, "CC",
datetime(2021-01-01), 3, "CCC",
datetime(2021-01-01), 1, "D",
datetime(2021-01-01), 2, "DD",
datetime(2021-01-01), 3, "DDD",
datetime(2021-01-01), 1, "E",
datetime(2021-01-01), 2, "EE",
datetime(2021-01-01), 3, "EEE",
datetime(2021-01-01), 1, "A",
datetime(2021-01-01), 2, "EE",
datetime(2021-01-01), 3, "DDD"
]
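// No N on the first two levels, so every Timestamp and every Headsection is kept;
// the last level keeps the 5 most frequent Names within each pair.
| top-nested of Timestamp by Dummy=max(1),
  top-nested of Headsection by Dummy2=max(1),
  top-nested 5 of Name by Count=count()
| project Timestamp, Headsection, Name, Count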
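
For comparison, here is a sketch of a third option that ranks the rows instead of building lists. It is not taken from either answer above: the sample rows, the `TopN` variable, and the `Day` and `Rank` column names are assumptions for illustration. It relies on `row_number()` restarting whenever the day or the headsection changes in the sorted input.

```kusto
// Hypothetical sample; TopN is set to 2 so the cut-off is visible on a small input.
let TopN = 2;
datatable(Timestamp:datetime, Headsection:long, Name:string) [
    datetime(2021-01-01 08:00:00), 1, "A",
    datetime(2021-01-01 09:00:00), 1, "A",
    datetime(2021-01-01 10:00:00), 1, "A",
    datetime(2021-01-01 11:00:00), 1, "B",
    datetime(2021-01-01 12:00:00), 1, "B",
    datetime(2021-01-01 13:00:00), 1, "C",
    datetime(2021-01-02 08:00:00), 1, "C"
]
| summarize NameCount = count() by Day = bin(Timestamp, 1d), Headsection, Name
// Sorting serializes the rows, which row_number() and prev() require.
| order by Day asc, Headsection asc, NameCount desc, Name asc
// Restart the numbering at 1 whenever a new (Day, Headsection) group begins.
| extend Rank = row_number(1, Day != prev(Day) or Headsection != prev(Headsection))
| where Rank <= TopN
| project Day, Headsection, Name, NameCount
```

On this input, January 1st should keep A (3) and B (2) and drop C, while January 2nd keeps its single row, C (1). Ties in `NameCount` are broken alphabetically by `Name` here; a different tie-breaker can be chosen in the `order by` line.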