PromQL: Is it possible to get a total count in query_range?

Question

For example, I have a Prometheus query that returns 1 for HTTP status 200 and 0 for any other HTTP status. I am using the query_range API, passing a time range (start and end) and a step.

API-Endpoint: http://my-prometheus.com/api/v1/query_range
Query: http_response_ok{appname="XXX"}
Start: 2020-06-17T00:00:00
End: 2020-06-17T23:59:59
Step: 300000ms (= 5 min)

The query above returns a 0 or a 1 for every 5 minutes of the whole day, about 289 data points in total.
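For reference, the request described above could be assembled roughly as follows. This is a minimal sketch: it assumes the timestamps are UTC and passes them as Unix seconds with the step in seconds, and `build_query_range_url` is a hypothetical helper name, not part of any client library. It only builds the URL and does not send the request.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

BASE_URL = "http://my-prometheus.com/api/v1/query_range"  # endpoint from the question


def build_query_range_url(query, start, end, step_seconds):
    """Return a query_range URL; start and end are timezone-aware datetimes."""
    params = {
        "query": query,
        "start": int(start.timestamp()),  # Unix seconds
        "end": int(end.timestamp()),
        "step": step_seconds,  # 300 s = 5 min
    }
    return BASE_URL + "?" + urlencode(params)


# Assumption: the times in the question are UTC.
start = datetime(2020, 6, 17, 0, 0, 0, tzinfo=timezone.utc)
end = datetime(2020, 6, 17, 23, 59, 59, tzinfo=timezone.utc)
url = build_query_range_url('http_response_ok{appname="XXX"}', start, end, 300)
print(url)
```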

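Is it possible to get the total number of 1s and 0s within a given time period? I tried count_over_time, which gives the total number of samples. How can I add a filter so that the count is returned only for samples whose value == 0 or == 1?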

count_over_time(http_response_ok{appname="XXX"}[24h])

FYI, the actual query is not http_request, so I cannot use http_request_total.

Answer 1

After some research, I found the answer. Basically, conditions inside {} are checked against labels, while conditions outside {} apply to the sample values.

So, to find the total count of samples with value == 1 over the last 24 hours, the query should look like this:

count_over_time((http_response_ok{appname="XXX"} == 1)[24h:])

And to find the total count of samples with value == 0 over the last 24 hours, the query should look like this:

count_over_time((http_response_ok{appname="XXX"} == 0)[24h:])
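The distinction between label matchers and value conditions can be modelled in plain Python. This is only an illustrative sketch with made-up samples, not how Prometheus evaluates queries internally: the label check selects the series, and the value check then keeps only the matching samples before counting. The name `count_matching` is hypothetical.

```python
# Hypothetical samples: (labels, value) pairs standing in for one day of data.
samples = [
    ({"appname": "XXX"}, 1),
    ({"appname": "XXX"}, 0),
    ({"appname": "XXX"}, 1),
    ({"appname": "YYY"}, 0),  # different label, excluded by the matcher
]


def count_matching(samples, appname, value):
    """Count samples whose label matches (the {} part) and whose value matches (the == part)."""
    return sum(
        1
        for labels, v in samples
        if labels.get("appname") == appname and v == value
    )


ones = count_matching(samples, "XXX", 1)
zeros = count_matching(samples, "XXX", 0)
print(ones, zeros)
```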

Answer 2

Note that /api/v1/query_range returns calculated results instead of the raw samples stored in the database. It returns exactly 1 + (end - start) / step samples on the [start ... end] time range with a step interval between them, where start, end and step are the corresponding query args passed to /api/v1/query_range. See the Prometheus docs for details on how the returned results are calculated. If you need the raw samples, send a query with a range selector to /api/v1/query instead. For example, /api/v1/query?query=http_response_ok[24h]&time=t would return the raw samples on the time range (t-24h ... t]. See this article for details.
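As a quick check of the 1 + (end - start) / step formula, here is a small sketch that applies it to the parameters from the question. It assumes the division is rounded down, since only whole steps that fit into [start ... end] produce a point; `points_returned` is a hypothetical helper name.

```python
def points_returned(start_s, end_s, step_s):
    """Number of points on [start, end] spaced step apart: 1 + floor((end - start) / step)."""
    return 1 + (end_s - start_s) // step_s


# 00:00:00 to 23:59:59 is 86399 seconds; the step is 300 seconds (5 min).
n_question = points_returned(0, 23 * 3600 + 59 * 60 + 59, 300)

# A full 24h window (end = next midnight) includes one more point.
n_full_day = points_returned(0, 24 * 3600, 300)

print(n_question, n_full_day)
```

Under that assumption the question's range yields 288 points, which is consistent with the "about 289" mentioned in the question.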

If the http_response_ok time series can only have the values 0 or 1, the following queries return the exact number of raw samples with the values 0 and 1:

The number of raw samples with value 1 over the last 24 hours:

avg_over_time(http_response_ok[24h]) * count_over_time(http_response_ok[24h])

The number of raw samples with value 0 over the last 24 hours:

(1 - avg_over_time(http_response_ok[24h])) * count_over_time(http_response_ok[24h])

How do these queries work? They use the avg_over_time() function to calculate the average value of the raw samples over the last 24 hours. Internally this value is calculated as sum(raw_samples) / count(raw_samples). The result is then multiplied by count_over_time(), which returns the number of raw samples over the last 24 hours, i.e. it equals count(raw_samples).

So the first query is equivalent to sum(raw_samples) / count(raw_samples) * count(raw_samples) = sum(raw_samples). Since raw_samples can only contain the values 0 and 1, sum(raw_samples) = count(raw_samples_equal_to_1).

The second query is equivalent to (1 - sum(raw_samples) / count(raw_samples)) * count(raw_samples) = count(raw_samples) - sum(raw_samples) = count(raw_samples) - count(raw_samples_equal_to_1) = count(raw_samples_equal_to_0).
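The derivation above can be checked numerically. The sketch below uses a made-up list of 0/1 samples and plain Python arithmetic in place of avg_over_time() and count_over_time(); the results are rounded because floating-point division may leave a tiny error.

```python
# Hypothetical raw samples; each value is either 0 or 1.
raw_samples = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]

count = len(raw_samples)        # stands in for count_over_time(...)
avg = sum(raw_samples) / count  # stands in for avg_over_time(...)

ones = round(avg * count)         # first query: avg * count
zeros = round((1 - avg) * count)  # second query: (1 - avg) * count

print(ones, zeros)
```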

If the http_response_ok time series can contain values other than 0 and 1, the queries above will not work. In that case, the count_gt_over_time, count_le_over_time, count_eq_over_time and count_ne_over_time functions from MetricsQL may help.
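To illustrate what these four functions count, here is a plain-Python model of their semantics over a list of raw sample values. It is only a sketch under the assumption that each function counts the raw samples that are greater than, less than or equal to, equal to, or not equal to the given threshold; the Python functions merely borrow the MetricsQL names and are not the actual implementation.

```python
def count_gt_over_time(values, gt):
    """Count samples strictly greater than gt."""
    return sum(1 for v in values if v > gt)


def count_le_over_time(values, le):
    """Count samples less than or equal to le."""
    return sum(1 for v in values if v <= le)


def count_eq_over_time(values, eq):
    """Count samples equal to eq."""
    return sum(1 for v in values if v == eq)


def count_ne_over_time(values, ne):
    """Count samples not equal to ne."""
    return sum(1 for v in values if v != ne)


# Hypothetical raw samples that include values other than 0 and 1.
values = [0, 1, 2, 1, 0, 5]
print(count_eq_over_time(values, 1), count_ne_over_time(values, 1))
print(count_gt_over_time(values, 1), count_le_over_time(values, 1))
```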

