Calculating totals and percentages for each row, in a time-boxed window, for a relation
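OK, I have two tables: jobs and job runs. I'm using Postgres.
I want to look at two periods: from 7 days ago to now, and from 14 days ago to 7 days ago.
For each job, I want the total number of runs, plus the percentage of successful and unsuccessful runs in each period. I cobbled together this terrible query: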
WITH results AS (
select
coalesce(count(case when succeeded = true AND timestamp BETWEEN NOW() - INTERVAL '14 DAY' AND NOW() - INTERVAL '7 DAY' then 1 end), 0) as previous_passes,
coalesce(count(case when succeeded = false AND timestamp BETWEEN NOW() - INTERVAL '14 DAY' AND NOW() - INTERVAL '7 DAY' then 1 end), 0) as previous_failures,
coalesce(count(case when timestamp BETWEEN NOW() - INTERVAL '14 DAY' AND NOW() - INTERVAL '7 DAY' then 1 end), 0) as previous_total_runs,
coalesce(count(case when infrastructure_failure = true AND timestamp BETWEEN NOW() - INTERVAL '14 DAY' AND NOW() - INTERVAL '7 DAY' then 1 end), 0) as previous_infrastructure_failures,
coalesce(count(case when succeeded = true AND timestamp > NOW() - INTERVAL '7 DAY' then 1 end), 0) as current_passes,
coalesce(count(case when succeeded = false AND timestamp > NOW() - INTERVAL '7 DAY' then 1 end), 0) as current_failures,
coalesce(count(case when timestamp > NOW() - INTERVAL '7 DAY' then 1 end), 0) as current_total_runs,
coalesce(count(case when infrastructure_failure = true AND timestamp > NOW() - INTERVAL '7 DAY' then 1 end), 0) as current_infrastructure_failures
FROM
prow_job_runs JOIN prow_jobs ON prow_jobs.id = prow_job_runs.prow_job_id WHERE prow_jobs.name = 'promote-release-openshift-machine-os-content-e2e-aws-4.10'
)
SELECT *,
previous_passes * 100.0 / NULLIF(previous_total_runs, 0) AS previous_pass_percentage,
previous_failures * 100.0 / NULLIF(previous_total_runs, 0) AS previous_failure_percentage,
current_passes * 100.0 / NULLIF(current_total_runs, 0) AS current_pass_percentage,
current_failures * 100.0 / NULLIF(current_total_runs, 0) AS current_failure_percentage
FROM results;
It gives me the results I want:
-[ RECORD 1 ]--------------------+-----------------------
previous_passes | 591
previous_failures | 4
previous_total_runs | 595
previous_infrastructure_failures | 1
current_passes | 67
current_failures | 0
current_total_runs | 67
current_infrastructure_failures | 0
previous_pass_percentage | 99.3277310924369748
previous_failure_percentage | 0.67226890756302521008
current_pass_percentage | 100.0000000000000000
current_failure_percentage | 0.00000000000000000000
The execution plan is as follows:
QUERY PLAN
------------------------------------------------------------------------------------------------------------------
Subquery Scan on results (cost=661.12..661.19 rows=1 width=192)
-> Aggregate (cost=661.12..661.13 rows=1 width=64)
-> Hash Join (cost=8.30..650.89 rows=93 width=10)
Hash Cond: (prow_job_runs.prow_job_id = prow_jobs.id)
-> Seq Scan on prow_job_runs (cost=0.00..603.60 rows=14460 width=18)
-> Hash (cost=8.29..8.29 rows=1 width=8)
-> Index Scan using prow_jobs_name_key on prow_jobs (cost=0.27..8.29 rows=1 width=8)
Index Cond: (name = 'promote-release-openshift-machine-os-content-e2e-aws-4.10'::text)
(8 rows)
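But it only works for a single job. How can I get the results for every job without running a for loop in my code?
I also think my query is really slow: it takes over 8 ms to run for just one job.
Thank you.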
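You need to provide the query's execution plan.
But you must make sure you have the necessary indexes, and limiting the number of rows you join may also help: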
WITH results AS (
select prow_jobs.name,
coalesce(count(case when succeeded = true AND timestamp BETWEEN NOW() - INTERVAL '14 DAY' AND NOW() - INTERVAL '7 DAY' then 1 end), 0) as previous_passes,
coalesce(count(case when succeeded = false AND timestamp BETWEEN NOW() - INTERVAL '14 DAY' AND NOW() - INTERVAL '7 DAY' then 1 end), 0) as previous_failures,
coalesce(count(case when timestamp BETWEEN NOW() - INTERVAL '14 DAY' AND NOW() - INTERVAL '7 DAY' then 1 end), 0) as previous_total_runs,
coalesce(count(case when infrastructure_failure = true AND timestamp BETWEEN NOW() - INTERVAL '14 DAY' AND NOW() - INTERVAL '7 DAY' then 1 end), 0) as previous_infrastructure_failures,
coalesce(count(case when succeeded = true AND timestamp > NOW() - INTERVAL '7 DAY' then 1 end), 0) as current_passes,
coalesce(count(case when succeeded = false AND timestamp > NOW() - INTERVAL '7 DAY' then 1 end), 0) as current_failures,
coalesce(count(case when timestamp > NOW() - INTERVAL '7 DAY' then 1 end), 0) as current_total_runs,
coalesce(count(case when infrastructure_failure = true AND timestamp > NOW() - INTERVAL '7 DAY' then 1 end), 0) as current_infrastructure_failures
FROM prow_job_runs
JOIN prow_jobs
ON prow_jobs.id = prow_job_runs.prow_job_id
and timestamp BETWEEN NOW() - INTERVAL '14 DAY' AND NOW()
group by prow_jobs.name
)
SELECT *,
previous_passes * 100.0 / NULLIF(previous_total_runs, 0) AS previous_pass_percentage,
previous_failures * 100.0 / NULLIF(previous_total_runs, 0) AS previous_failure_percentage,
current_passes * 100.0 / NULLIF(current_total_runs, 0) AS current_pass_percentage,
current_failures * 100.0 / NULLIF(current_total_runs, 0) AS current_failure_percentage
FROM results;
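The same per-job aggregation can also be written more compactly with the aggregate FILTER clause (available since PostgreSQL 9.4). This is only a sketch under a few assumptions: the aliases `r` and `j` are introduced here for illustration, and `succeeded`, `infrastructure_failure` and `timestamp` are assumed to be columns of prow_job_runs, with the first two being boolean. The COALESCE wrappers are dropped because count() returns 0, not NULL, when nothing matches.

```sql
-- Sketch: per-job counts for the previous and current 7-day periods.
-- Aliases r and j are illustrative; column ownership is an assumption.
WITH results AS (
    SELECT
        j.name,
        count(*) FILTER (WHERE r.succeeded              AND r.timestamp <= NOW() - INTERVAL '7 DAY') AS previous_passes,
        count(*) FILTER (WHERE NOT r.succeeded          AND r.timestamp <= NOW() - INTERVAL '7 DAY') AS previous_failures,
        count(*) FILTER (WHERE r.timestamp <= NOW() - INTERVAL '7 DAY')                              AS previous_total_runs,
        count(*) FILTER (WHERE r.infrastructure_failure AND r.timestamp <= NOW() - INTERVAL '7 DAY') AS previous_infrastructure_failures,
        count(*) FILTER (WHERE r.succeeded              AND r.timestamp >  NOW() - INTERVAL '7 DAY') AS current_passes,
        count(*) FILTER (WHERE NOT r.succeeded          AND r.timestamp >  NOW() - INTERVAL '7 DAY') AS current_failures,
        count(*) FILTER (WHERE r.timestamp >  NOW() - INTERVAL '7 DAY')                              AS current_total_runs,
        count(*) FILTER (WHERE r.infrastructure_failure AND r.timestamp >  NOW() - INTERVAL '7 DAY') AS current_infrastructure_failures
    FROM prow_job_runs AS r
    JOIN prow_jobs AS j ON j.id = r.prow_job_id
    -- Rows older than 14 days are discarded before aggregation,
    -- so each FILTER only has to split the window at the 7-day mark.
    WHERE r.timestamp >= NOW() - INTERVAL '14 DAY'
    GROUP BY j.name
)
SELECT *,
    previous_passes   * 100.0 / NULLIF(previous_total_runs, 0) AS previous_pass_percentage,
    previous_failures * 100.0 / NULLIF(previous_total_runs, 0) AS previous_failure_percentage,
    current_passes    * 100.0 / NULLIF(current_total_runs, 0)  AS current_pass_percentage,
    current_failures  * 100.0 / NULLIF(current_total_runs, 0)  AS current_failure_percentage
FROM results;
```

Because this is an inner join restricted to the last 14 days, a job with no runs in that window produces no row at all. If such jobs should still be listed with zero counts, one option is to start from prow_jobs and LEFT JOIN the runs, moving the 14-day condition into the ON clause.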
It also looks like you don't have any indexes on the prow_job_runs table. Add an index on that table covering the columns (id, succeeded, infrastructure_failure, timestamp, prow_job_id).
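As a rough illustration, the suggested index could be created as below, followed by an EXPLAIN run to check whether the planner actually uses it. The index name prow_job_runs_stats_idx is made up for this example, and the second statement is a simplified stand-in for the real query, not the query itself.

```sql
-- Hypothetical index name; the column list is the one suggested above.
CREATE INDEX prow_job_runs_stats_idx
    ON prow_job_runs (id, succeeded, infrastructure_failure, "timestamp", prow_job_id);

-- Run a simplified 14-day aggregation and show the plan with real timings.
EXPLAIN (ANALYZE, BUFFERS)
SELECT prow_job_id, count(*)
FROM prow_job_runs
WHERE "timestamp" >= NOW() - INTERVAL '14 DAY'
GROUP BY prow_job_id;
```

With a table of roughly 14,000 rows, as estimated in the plan in the question, the planner may still choose a sequential scan, so it is worth comparing the EXPLAIN output before and after creating the index. A B-tree index tends to help most when its leading columns appear in the query's conditions, so an index that starts with "timestamp" or prow_job_id may be worth trying as well.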