从 Postgres 11.6 中的循环开始选择事件

Selecting events from the start of a cycle in Postgres 11.6

我正在尝试使用 Postgres 11.6 编写一个查询,以在分组期间根据事件在序列中的位置过滤掉事件。我们有很多事件数据分组在 "cycles" 中。这是 table 设置:

这是 table 设置:

CREATE TABLE IF NOT EXISTS data.event (
    id uuid NOT NULL,
    cycle_id uuid,
    dts timestamp without time zone,
    state citext,
    PRIMARY KEY(id)
);

以下是来自三个不同周期的一些示例记录:

id                                      cycle_id                                dts                    state
f5c1ecef-b917-6c44-ae18-ed8c8b9e71d5    00000910-da83-7c4b-b47a-38be998471d1    2019-05-15 15:54:17    Boxed
dbf713c9-802d-a445-9cf8-ff2953d1ef6c    00000910-da83-7c4b-b47a-38be998471d1    2019-05-16 09:08:23    Assembly
93ea8e6a-2c41-d848-805b-dc83d572af1e    00000910-da83-7c4b-b47a-38be998471d1    2019-05-16 09:34:51    Waiting For Quality
bc30d154-8c72-7e40-b75d-048b7f2ca3d4    00000910-da83-7c4b-b47a-38be998471d1    2019-05-16 10:43:31    Quality
0b69895a-36fd-274b-8730-e81206636013    00000910-da83-7c4b-b47a-38be998471d1    2019-05-16 13:02:07    Waiting After Quality

550a0b9e-7731-4347-ba55-4ca222b3e69f    0000cf14-e5d3-4c4c-9e44-b39bc5f7fa7b    2019-04-15 13:39:59    Boxed
1c43d256-807c-8643-82a8-1c32653cedb3    0000cf14-e5d3-4c4c-9e44-b39bc5f7fa7b    2019-04-15 19:14:36    Waiting For Assembly
9b4e8e70-d383-4b47-aed4-4aa9de3b5b5d    0000cf14-e5d3-4c4c-9e44-b39bc5f7fa7b    2019-04-15 20:13:59    Assembly
1d99218c-ea25-6c4d-91a5-07d5198971f3    0000cf14-e5d3-4c4c-9e44-b39bc5f7fa7b    2019-04-15 20:16:11    Waiting For Quality
b9571373-4de1-cc4c-badc-58cbaca2de1a    0000cf14-e5d3-4c4c-9e44-b39bc5f7fa7b    2019-04-15 21:36:59    Quality
1d233ed1-ea57-f741-af76-cf2a8630f5d7    0000cf14-e5d3-4c4c-9e44-b39bc5f7fa7b    2019-04-15 23:40:04    Waiting After Quality
db305c40-485b-b542-8c05-739ea46fa006    0000cf14-e5d3-4c4c-9e44-b39bc5f7fa7b    2019-04-16 22:15:13    Storage
2afa5d2c-f061-4c47-be85-010010a12785    0000cf14-e5d3-4c4c-9e44-b39bc5f7fa7b    2019-04-16 23:01:28    Vendor Dropoff

43a35799-d58e-5049-b9ca-12e0980407b1    00010d69-51d0-8b4f-9acb-1573a97517cd    2019-08-08 14:09:56    Boxed
8db57a1a-6f02-7c46-81ca-f1e6acb464cd    00010d69-51d0-8b4f-9acb-1573a97517cd    2019-08-10 12:18:02    Assembly
74d53a8d-5e22-274d-b429-b3bfa89b0211    00010d69-51d0-8b4f-9acb-1573a97517cd    2019-08-10 12:18:25    Waiting For Quality
8517b241-1a2f-2c47-80f5-f910a7b0729c    00010d69-51d0-8b4f-9acb-1573a97517cd    2019-08-10 12:25:14    Quality
ebaaf219-7319-084d-9179-ec7b1f4bffdf    00010d69-51d0-8b4f-9acb-1573a97517cd    2019-08-10 14:53:42    Waiting After Quality
3ee6ae5b-9e44-4944-9aad-b084552309af    00010d69-51d0-8b4f-9acb-1573a97517cd    2019-08-14 00:08:19    Storage
cea8cfdb-533d-5e42-b136-b9bfd5d69eee    00010d69-51d0-8b4f-9acb-1573a97517cd    2019-08-14 23:00:00    Transport
b27e43da-df4f-f642-8158-25efa2f841a4    00010d69-51d0-8b4f-9acb-1573a97517cd    2019-08-14 14:09:17    Return to Vendor

实际上,循环包含更多步骤,我只包含足够的示例。这里的目标是过滤掉特定周期阶段后的事件,"Waiting After Quality" 或 "Storage"。每个周期都有这些事件中的一个或两个。写成命令式,每个循环的逻辑是这样的:

If (The cycle includes "Storage") then
    Keep only the cycle data up to and include "Storage"
Else
    Keep only the cycle data up to and including "Waiting After Quality"
End if

目标结果仅包括这些行:

id                                      cycle_id                                dts                    state
f5c1ecef-b917-6c44-ae18-ed8c8b9e71d5    00000910-da83-7c4b-b47a-38be998471d1    2019-05-15 15:54:17    Boxed
dbf713c9-802d-a445-9cf8-ff2953d1ef6c    00000910-da83-7c4b-b47a-38be998471d1    2019-05-16 09:08:23    Assembly
93ea8e6a-2c41-d848-805b-dc83d572af1e    00000910-da83-7c4b-b47a-38be998471d1    2019-05-16 09:34:51    Waiting For Quality
bc30d154-8c72-7e40-b75d-048b7f2ca3d4    00000910-da83-7c4b-b47a-38be998471d1    2019-05-16 10:43:31    Quality
0b69895a-36fd-274b-8730-e81206636013    00000910-da83-7c4b-b47a-38be998471d1    2019-05-16 13:02:07    Waiting After Quality

550a0b9e-7731-4347-ba55-4ca222b3e69f    0000cf14-e5d3-4c4c-9e44-b39bc5f7fa7b    2019-04-15 13:39:59    Boxed
1c43d256-807c-8643-82a8-1c32653cedb3    0000cf14-e5d3-4c4c-9e44-b39bc5f7fa7b    2019-04-15 19:14:36    Waiting For Assembly
9b4e8e70-d383-4b47-aed4-4aa9de3b5b5d    0000cf14-e5d3-4c4c-9e44-b39bc5f7fa7b    2019-04-15 20:13:59    Assembly
1d99218c-ea25-6c4d-91a5-07d5198971f3    0000cf14-e5d3-4c4c-9e44-b39bc5f7fa7b    2019-04-15 20:16:11    Waiting For Quality
b9571373-4de1-cc4c-badc-58cbaca2de1a    0000cf14-e5d3-4c4c-9e44-b39bc5f7fa7b    2019-04-15 21:36:59    Quality
1d233ed1-ea57-f741-af76-cf2a8630f5d7    0000cf14-e5d3-4c4c-9e44-b39bc5f7fa7b    2019-04-15 23:40:04    Waiting After Quality
db305c40-485b-b542-8c05-739ea46fa006    0000cf14-e5d3-4c4c-9e44-b39bc5f7fa7b    2019-04-16 22:15:13    Storage

43a35799-d58e-5049-b9ca-12e0980407b1    00010d69-51d0-8b4f-9acb-1573a97517cd    2019-08-08 14:09:56    Boxed
8db57a1a-6f02-7c46-81ca-f1e6acb464cd    00010d69-51d0-8b4f-9acb-1573a97517cd    2019-08-10 12:18:02    Assembly
74d53a8d-5e22-274d-b429-b3bfa89b0211    00010d69-51d0-8b4f-9acb-1573a97517cd    2019-08-10 12:18:25    Waiting For Quality
8517b241-1a2f-2c47-80f5-f910a7b0729c    00010d69-51d0-8b4f-9acb-1573a97517cd    2019-08-10 12:25:14    Quality
ebaaf219-7319-084d-9179-ec7b1f4bffdf    00010d69-51d0-8b4f-9acb-1573a97517cd    2019-08-10 14:53:42    Waiting After Quality
3ee6ae5b-9e44-4944-9aad-b084552309af    00010d69-51d0-8b4f-9acb-1573a97517cd    2019-08-14 00:08:19    Storage

一个简单的分组查询按时间顺序排列周期事件:

  select cycle_id,
         string_agg(state::text, ', ' order by dts asc) as states
     from event   
 group by cycle_id 



cycle_id                                states
00000910-da83-7c4b-b47a-38be998471d1    Boxed, Assembly, Waiting For Quality, Quality, Waiting After Quality
0000cf14-e5d3-4c4c-9e44-b39bc5f7fa7b    Boxed, Waiting For Assembly, Assembly, Waiting For Quality, Quality, Waiting After Quality, Storage, Vendor Dropoff
00010d69-51d0-8b4f-9acb-1573a97517cd    Boxed, Assembly, Waiting For Quality, Quality, Waiting After Quality, Storage, Return to Vendor, Transport

我需要找到从每个周期开始直到并包括 "Storage" 或 "Waiting After Quality" 的事件,如果两者都存在,则使用 "Storage"。

我被难住了。我知道如何用客户端语言写这个,但不是 SQL。我 非常 更愿意在 SQL 中完成它。如果有人能告诉我怎么做,或者给我指出正确的方向,那就太好了。

答案

这是一个最终的解决方案,对已接受的答案进行了一些小的语法更改。

SELECT event.*
FROM data.event
JOIN (
  SELECT  cycle_id,
         COALESCE(
            MIN(dts) FILTER (WHERE state = 'Storage'),
            MIN(dts) FILTER (WHERE state = 'Waiting After Quality')) as upto

     FROM data.event

 GROUP BY cycle_id

   ) last_events USING (cycle_id)

WHERE dts <= upto

ORDER BY cycle_id,
         dts

我建议首先找到您想要 select 事件的每个周期的时间戳:

SELECT
  cycle_id,
  COALESCE(
    MIN(dts) FILTER (WHERE state = "Storage"),
    MIN(dts) FILTER (WHERE state = "Waiting After Quality")
  ) as upto
FROM data.event
GROUP BY cycle_id

然后您可以将这些结果与所有事件结合起来,并根据各自的条件进行过滤:

SELECT event.*
FROM data.event
JOIN (
  SELECT
    cycle_id,
    COALESCE(
      MIN(dts) FILTER (WHERE state = "Storage"),
      MIN(dts) FILTER (WHERE state = "Waiting After Quality")
    ) as upto
  FROM data.event
  GROUP BY cycle_id
) last_events USING (cycle_id)
WHERE dts <= upto