Retrieving the last record in each group from a database with ORDER BY

I have a table ticket that contains the data shown below:

Id   Impact   group        create_date
------------------------------------------
1     3        ABC       2020-07-28 00:42:00.0
1     2        ABC       2020-07-28 00:45:00.0
1     3        ABC       2020-07-28 00:48:00.0
1     3        ABC       2020-07-28 00:52:00.0
1     3        XYZ       2020-07-28 00:55:00.0
1     3        XYZ       2020-07-28 00:59:00.0

Expected result:

Id   Impact   group        create_date
------------------------------------------
1     3        ABC       2020-07-28 00:42:00.0
1     2        ABC       2020-07-28 00:45:00.0
1     3        ABC       2020-07-28 00:52:00.0
1     3        XYZ       2020-07-28 00:59:00.0

Currently, this is the query I am using:

WITH final AS (
    SELECT p.*, 
           ROW_NUMBER() OVER(PARTITION BY p.id,p.group,p.impact
                                 ORDER BY p.create_date desc, p.impact) AS rk
      FROM ticket p 
)
SELECT f.*
  FROM final f 
 WHERE f.rk = 1

The result I get is:

Id   Impact    group         create_date
-----------------------------------------
1     2        ABC       2020-07-28 00:45:00.0
1     3        ABC       2020-07-28 00:52:00.0
1     3        XYZ       2020-07-28 00:59:00.0

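It seems that PARTITION BY takes precedence over the ordering of the values, so non-adjacent rows with the same id, group and impact end up in one partition. Is there another way to get the expected result? I am running these queries on Amazon Redshift.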

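You can use LEAD() to check whether Impact changes from one row to the next, and keep only the rows after which the value changes (or that have no following row in their group).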

WITH
  look_forward AS
(
  SELECT
    *,
    LEAD(impact) OVER (PARTITION BY id, "group" ORDER BY create_date) AS lead_impact  -- "group" is a reserved word, so it is quoted
  FROM
    ticket
)
SELECT
  *
FROM
  look_forward 
WHERE
     lead_impact IS NULL
  OR lead_impact <> impact
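
To check the LEAD() query against the sample data from the question, here is a minimal sketch that runs it through Python's built-in sqlite3 module. SQLite standing in for Redshift is an assumption made only so the example is self-contained (it needs SQLite 3.25 or later for window functions), and the helper name last_rows_before_change is hypothetical.

```python
import sqlite3

# Sample rows copied from the question. SQLite is used here instead of
# Redshift (an assumption) so the query can be run locally.
ROWS = [
    (1, 3, "ABC", "2020-07-28 00:42:00.0"),
    (1, 2, "ABC", "2020-07-28 00:45:00.0"),
    (1, 3, "ABC", "2020-07-28 00:48:00.0"),
    (1, 3, "ABC", "2020-07-28 00:52:00.0"),
    (1, 3, "XYZ", "2020-07-28 00:55:00.0"),
    (1, 3, "XYZ", "2020-07-28 00:59:00.0"),
]

# Same logic as the answer's query; "group" is quoted because it is a keyword.
QUERY = """
WITH look_forward AS (
  SELECT
    id, impact, "group", create_date,
    LEAD(impact) OVER (PARTITION BY id, "group" ORDER BY create_date) AS lead_impact
  FROM ticket
)
SELECT id, impact, "group", create_date
FROM look_forward
WHERE lead_impact IS NULL
   OR lead_impact <> impact
ORDER BY create_date
"""


def last_rows_before_change(rows):
    """Load rows into an in-memory table and return the rows the query keeps."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE ticket (id INTEGER, impact INTEGER, "group" TEXT, create_date TEXT)'
    )
    conn.executemany("INSERT INTO ticket VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


result = last_rows_before_change(ROWS)
for row in result:
    print(row)
```

On the sample data this keeps the 00:42, 00:45, 00:52 and 00:59 rows, which matches the expected result in the question.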

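You seem to want the rows where id/impact/group changes relative to the next row. A simple approach is to look at the next create_date overall and the next create_date within the same id/impact/group. If the two are the same, the next row belongs to the same group, so the current row is filtered out: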

select t.*
from (select t.*,
             lead(create_date) over (order by create_date) as next_create_date,
             lead(create_date) over (partition by id, impact, "group" order by create_date) as next_create_date_img  -- "group" is a reserved word, so it is quoted
      from ticket t
     ) t
where next_create_date_img is null or next_create_date_img <> next_create_date;
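
The two-LEAD comparison can be tried on the question's sample data in the same way. This is a sketch under the assumption that SQLite (3.25 or later, via Python's sqlite3 module) behaves like Redshift for this query; the helper name rows_before_group_change is hypothetical. Note that the first LEAD has no PARTITION BY, so next_create_date is simply the next row in the whole table.

```python
import sqlite3

# Sample rows copied from the question; SQLite stands in for Redshift
# here (an assumption) so the example runs without a cluster.
ROWS = [
    (1, 3, "ABC", "2020-07-28 00:42:00.0"),
    (1, 2, "ABC", "2020-07-28 00:45:00.0"),
    (1, 3, "ABC", "2020-07-28 00:48:00.0"),
    (1, 3, "ABC", "2020-07-28 00:52:00.0"),
    (1, 3, "XYZ", "2020-07-28 00:55:00.0"),
    (1, 3, "XYZ", "2020-07-28 00:59:00.0"),
]

# A row is kept when the next row overall is not the next row of its own
# id/impact/group combination, i.e. the combination changes after it.
QUERY = """
SELECT id, impact, "group", create_date
FROM (
  SELECT
    t.*,
    LEAD(create_date) OVER (ORDER BY create_date) AS next_create_date,
    LEAD(create_date) OVER (PARTITION BY id, impact, "group"
                            ORDER BY create_date) AS next_create_date_img
  FROM ticket t
) t
WHERE next_create_date_img IS NULL
   OR next_create_date_img <> next_create_date
ORDER BY create_date
"""


def rows_before_group_change(rows):
    """Load rows into an in-memory table and return the rows the query keeps."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE ticket (id INTEGER, impact INTEGER, "group" TEXT, create_date TEXT)'
    )
    conn.executemany("INSERT INTO ticket VALUES (?, ?, ?, ?)", rows)
    kept = conn.execute(QUERY).fetchall()
    conn.close()
    return kept


kept = rows_before_group_change(ROWS)
for row in kept:
    print(row)
```

For the sample data the rows kept are again the ones at 00:42, 00:45, 00:52 and 00:59.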