Azure Data Explorer(kusto)如何按天汇总并与“其他”放在首位?
Azure Data Explorer (kusto) how to summarize by day and top with “others”?
我是 Kusto 语言的新手 – 请帮助我创建查询。
此处数据集:
let T = datatable(d:datetime , s:string)
[
datetime(2019-10-01T00:01:00.00), "A",
datetime(2019-10-01T00:02:00.00), "A",
datetime(2019-10-01T00:03:00.00), "A",
datetime(2019-10-02T00:01:00.00), "A",
datetime(2019-10-02T00:02:00.00), "A",
datetime(2019-10-02T00:03:00.00), "A",
datetime(2019-10-01T00:01:00.00), "C",
datetime(2019-10-01T00:02:00.00), "C",
datetime(2019-10-02T00:01:00.00), "C",
datetime(2019-10-02T00:02:00.00), "C",
datetime(2019-10-01T00:01:00.00), "D",
datetime(2019-10-02T00:01:00.00), "D",
datetime(2019-10-01T00:01:00.00), "E",
datetime(2019-10-02T00:01:00.00), "E",
];
我希望与其他人一起获得每个“s”字符串的前 2 个,并按天汇总。
IE。结果需要是:
2019-10-01T00:00:00Z A 3
2019-10-01T00:00:00Z C 2
2019-10-01T00:00:00Z Other 2
2019-10-02T00:00:00Z A 3
2019-10-02T00:00:00Z C 2
2019-10-02T00:00:00Z Other 2
我想我以查询结束:
T
| summarize c = count() by bin(d, 1d), s
| top-nested of d by dummy0 = max(0)
| top-nested 2 of s with others = "Other" by c0 = sum(c);
但是不行。
请指教。
找到了自己的方法来完成它。不确定它是如何优化的。
let Q=T
|top-nested 2 of s with others = "" by c = count()
| project sq = s;
T
| join kind=leftouter Q on $left.s==$right.sq
| summarize c = count() by bin(d, 1d), s=sq
| project d,s=iif(isempty(s),"Other",s),c
这是一种使用顶层嵌套执行此操作的方法,其性能应该比您的建议更好:
let T = datatable(d:datetime , s:string)
[
datetime(2019-10-01T00:01:00.00), "A",
datetime(2019-10-01T00:02:00.00), "A",
datetime(2019-10-01T00:03:00.00), "A",
datetime(2019-10-02T00:01:00.00), "A",
datetime(2019-10-02T00:02:00.00), "A",
datetime(2019-10-02T00:03:00.00), "A",
datetime(2019-10-01T00:01:00.00), "C",
datetime(2019-10-01T00:02:00.00), "C",
datetime(2019-10-02T00:01:00.00), "C",
datetime(2019-10-02T00:02:00.00), "C",
datetime(2019-10-01T00:01:00.00), "D",
datetime(2019-10-02T00:01:00.00), "D",
datetime(2019-10-01T00:01:00.00), "E",
datetime(2019-10-02T00:01:00.00), "E",
];
T
| summarize c = count() by bin(d, 1d), s
| top-nested of d by dummy=max(0), top-nested 2 of s with others = "Others" by _count = sum(c)
| where _count > 0 | project-away dummy
我是 Kusto 语言的新手 – 请帮助我创建查询。
此处数据集:
let T = datatable(d:datetime , s:string)
[
datetime(2019-10-01T00:01:00.00), "A",
datetime(2019-10-01T00:02:00.00), "A",
datetime(2019-10-01T00:03:00.00), "A",
datetime(2019-10-02T00:01:00.00), "A",
datetime(2019-10-02T00:02:00.00), "A",
datetime(2019-10-02T00:03:00.00), "A",
datetime(2019-10-01T00:01:00.00), "C",
datetime(2019-10-01T00:02:00.00), "C",
datetime(2019-10-02T00:01:00.00), "C",
datetime(2019-10-02T00:02:00.00), "C",
datetime(2019-10-01T00:01:00.00), "D",
datetime(2019-10-02T00:01:00.00), "D",
datetime(2019-10-01T00:01:00.00), "E",
datetime(2019-10-02T00:01:00.00), "E",
];
我希望与其他人一起获得每个“s”字符串的前 2 个,并按天汇总。 IE。结果需要是:
2019-10-01T00:00:00Z A 3
2019-10-01T00:00:00Z C 2
2019-10-01T00:00:00Z Other 2
2019-10-02T00:00:00Z A 3
2019-10-02T00:00:00Z C 2
2019-10-02T00:00:00Z Other 2
我想我以查询结束:
T
| summarize c = count() by bin(d, 1d), s
| top-nested of d by dummy0 = max(0)
| top-nested 2 of s with others = "Other" by c0 = sum(c);
但是不行。
请指教。
找到了自己的方法来完成它。不确定它是如何优化的。
let Q=T
|top-nested 2 of s with others = "" by c = count()
| project sq = s;
T
| join kind=leftouter Q on $left.s==$right.sq
| summarize c = count() by bin(d, 1d), s=sq
| project d,s=iif(isempty(s),"Other",s),c
这是一种使用顶层嵌套执行此操作的方法,其性能应该比您的建议更好:
let T = datatable(d:datetime , s:string)
[
datetime(2019-10-01T00:01:00.00), "A",
datetime(2019-10-01T00:02:00.00), "A",
datetime(2019-10-01T00:03:00.00), "A",
datetime(2019-10-02T00:01:00.00), "A",
datetime(2019-10-02T00:02:00.00), "A",
datetime(2019-10-02T00:03:00.00), "A",
datetime(2019-10-01T00:01:00.00), "C",
datetime(2019-10-01T00:02:00.00), "C",
datetime(2019-10-02T00:01:00.00), "C",
datetime(2019-10-02T00:02:00.00), "C",
datetime(2019-10-01T00:01:00.00), "D",
datetime(2019-10-02T00:01:00.00), "D",
datetime(2019-10-01T00:01:00.00), "E",
datetime(2019-10-02T00:01:00.00), "E",
];
T
| summarize c = count() by bin(d, 1d), s
| top-nested of d by dummy=max(0), top-nested 2 of s with others = "Others" by _count = sum(c)
| where _count > 0 | project-away dummy