如何从每天的结果的第 99 个百分位数中提取 ID
How to extract the ids from the 99th percentile of results for each day
我在 Kusto 中有一个数据集,我想收集每一天的 id
,其中 latency
大于当天的第 99 个百分位数。
例如我有一个 table 这样的:
id
latency
time
a
23.4
1/2/2022
c
1.4
1/2/2022
b
25.6
1/3/2022
d
2.3
1/3/2022
我希望结果 table 只是
time
id
1/2/2022
a
1/3/2022
b
其中 a
和 b
来自那几天延迟时间的第 99 个百分位数
我想从事的工作:
let 99p= T
| summarize percentiles(latency, 99) by bin(time, 1d);
T
| where startofday(time) == 99p.time
| where latency > 99p.latency
| project time, id
感谢您的任何见解,对 kusto 来说还很陌生
// This is not a part of the solution, only generation of a sample data set
let T = materialize (range id from 1 to 1000 step 1 | extend ['time'] = ago(rand()*10d), latency = round(rand()*100,2));
// The solution starts here
T | extend day = bin(['time'], 1d) | as T2
| summarize percentiles(latency, 99) by day
| join kind=inner T2 on day
| where latency >= percentile_latency_99
| summarize make_list(id) by day
day
list_id
2022-03-20T00:00:00Z
[298]
2022-03-21T00:00:00Z
[642]
2022-03-22T00:00:00Z
[200,504]
2022-03-23T00:00:00Z
[726,885]
2022-03-24T00:00:00Z
[590,975]
2022-03-25T00:00:00Z
[107]
2022-03-26T00:00:00Z
[836]
2022-03-27T00:00:00Z
[68]
2022-03-28T00:00:00Z
[527,859]
2022-03-29T00:00:00Z
[172,874]
2022-03-30T00:00:00Z
[274]
我在 Kusto 中有一个数据集,我想收集每一天的 id
,其中 latency
大于当天的第 99 个百分位数。
例如我有一个 table 这样的:
id | latency | time |
---|---|---|
a | 23.4 | 1/2/2022 |
c | 1.4 | 1/2/2022 |
b | 25.6 | 1/3/2022 |
d | 2.3 | 1/3/2022 |
我希望结果 table 只是
time | id |
---|---|
1/2/2022 | a |
1/3/2022 | b |
其中 a
和 b
来自那几天延迟时间的第 99 个百分位数
我想从事的工作:
let 99p= T
| summarize percentiles(latency, 99) by bin(time, 1d);
T
| where startofday(time) == 99p.time
| where latency > 99p.latency
| project time, id
感谢您的任何见解,对 kusto 来说还很陌生
// This is not a part of the solution, only generation of a sample data set
let T = materialize (range id from 1 to 1000 step 1 | extend ['time'] = ago(rand()*10d), latency = round(rand()*100,2));
// The solution starts here
T | extend day = bin(['time'], 1d) | as T2
| summarize percentiles(latency, 99) by day
| join kind=inner T2 on day
| where latency >= percentile_latency_99
| summarize make_list(id) by day
day | list_id |
---|---|
2022-03-20T00:00:00Z | [298] |
2022-03-21T00:00:00Z | [642] |
2022-03-22T00:00:00Z | [200,504] |
2022-03-23T00:00:00Z | [726,885] |
2022-03-24T00:00:00Z | [590,975] |
2022-03-25T00:00:00Z | [107] |
2022-03-26T00:00:00Z | [836] |
2022-03-27T00:00:00Z | [68] |
2022-03-28T00:00:00Z | [527,859] |
2022-03-29T00:00:00Z | [172,874] |
2022-03-30T00:00:00Z | [274] |