Select Last Rows with Distinct Field

I have a table with the following schema:

id  itemid  date        some additional data
1   1000    10/12/2020  a
2   1000    10/12/2020  b
3   1002    09/12/2020  c
4   1001    07/12/2020  d
5   1000    05/12/2020  e
6   1005    03/12/2020  f
7   1003    03/12/2020  g

In this table, only the id field is unique. What I care about is getting the rows that contain the last X distinct itemids, ordered by date.

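For example, in the sample above, if I want the last 3 distinct itemids, I should get the first 4 rows, because those rows contain three distinct itemids: 1000, 1002 and 1001. I am not sure how to achieve this with a single SQL statement.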

You can use an analytic function as follows:

select * from
(select t.*,
       count(distinct itemid) over (order by date desc) as cnt
  from your_Table t) t
 where cnt <= 3

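If I understand correctly, you want to count the number of distinct item IDs seen so far at each row (ordered by date) and return all rows where that count is three or less.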

If Postgres supported this, you could use:

select t.*
from (select t.*,
             count(distinct itemid) over (order by date desc) as cnt_itemid
      from t
     ) t
where cnt_itemid <= 3;

Alas, Postgres does not support COUNT(DISTINCT) as a window function. But you can calculate the same running count without it, by counting only one row per itemid:

select t.*
from (select t.*, 
             count(*) filter (where id = min_id) over (order by date desc) as cnt_itemid
      from (select t.*,
                   min(id) over (partition by itemid order by date desc) as min_id
            from t
           ) t
     ) t
where cnt_itemid <= 3;

However, this returns all of the most recent rows before the fourth item appears, so it includes extra rows.
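
To see where the extra row comes from, it can help to look at the running count itself. The sketch below assumes the table is named t and holds exactly the seven sample rows from the question; it only exposes the intermediate columns and does not filter anything.

```sql
-- Inspect the intermediate values (assumes table t with the question's sample rows).
select id, itemid, date, min_id,
       -- counts one flagged row (id = min_id) per itemid, newest dates first
       count(*) filter (where id = min_id) over (order by date desc) as cnt_itemid
from (select t.*,
             min(id) over (partition by itemid order by date desc) as min_id
      from t
     ) t
order by date desc, id;

-- With the sample rows, cnt_itemid should come out as:
--   id 1, 2 -> 1   (rows sharing a date are peers and get the same count)
--   id 3    -> 2
--   id 4    -> 3
--   id 5    -> 3   (itemid 1000 again, so the count does not move)
--   id 6, 7 -> 5   (two new itemids on the same date)
```

Row 5 still has a count of 3, which is why a plain `cnt_itemid <= 3` filter keeps it.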

To get just the four rows, you want everything up to the first row where the running item count reaches 3. One method is:

select t.*
from (select t.*, min(id) filter (where cnt_itemid = 3) over () as min_cnt_itemid_3
      from (select t.*, 
                   count(*) filter (where id = min_id) over (order by date desc) as cnt_itemid
            from (select t.*,
                         min(id) over (partition by itemid order by date desc) as min_id
                  from t
                 ) t
           ) t
     ) t
where id <= min_cnt_itemid_3;
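
Note that the query above compares id values, so it depends on ids growing as dates get older, which happens to hold in the sample data. If that cannot be guaranteed, one possible variant (a sketch using a different technique, row_number(), rather than the method above) flags the newest row of each itemid and orders explicitly by date. The aliases s, u and rn are made up for illustration, and breaking date ties by id is an assumption.

```sql
-- Sketch only: assumes table t with columns id, itemid, date as in the question.
select id, itemid, date
from (select s.*,
             -- running number of distinct itemids seen so far, newest first
             sum(case when rn = 1 then 1 else 0 end)
                 over (order by date desc, id) as cnt_itemid
      from (select t.*,
                   -- rn = 1 marks the newest row of each itemid (date ties broken by id)
                   row_number() over (partition by itemid order by date desc, id) as rn
            from t
           ) s
     ) u
where cnt_itemid < 3
   or (cnt_itemid = 3 and rn = 1)   -- stop at the first row of the third itemid
order by date desc, id;
```

Because id is unique, the ordering `date desc, id` is total, so the running sum advances one row at a time instead of jumping across rows that share a date.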

You can also do this by identifying the first occurrence of the "third item" and then selecting all rows up to that row:

select t.*
from t join
     (select itemid, min(max_date) over () as min_max_date
      from (select t.itemid, max(date) as max_date
            from t
            group by t.itemid
            order by max(t.date) desc
            limit 3
           ) t
      ) tt
      on t.itemid = tt.itemid and t.date >= tt.min_max_date;

This fiddle shows each of these.
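
To try the queries locally, the sample data from the question can be loaded with something like the following. This is a minimal setup sketch: the table name t matches the queries above, but the column types and the column name extra are assumptions, and the dates are read as DD/MM/YYYY.

```sql
-- Hypothetical setup for the question's sample data (PostgreSQL syntax).
create table t (
    id     int primary key,   -- the only unique column
    itemid int,
    date   date,
    extra  text               -- stands in for "some additional data"
);

-- Dates from the question, assumed to be DD/MM/YYYY, written in ISO format.
insert into t (id, itemid, date, extra) values
    (1, 1000, '2020-12-10', 'a'),
    (2, 1000, '2020-12-10', 'b'),
    (3, 1002, '2020-12-09', 'c'),
    (4, 1001, '2020-12-07', 'd'),
    (5, 1000, '2020-12-05', 'e'),
    (6, 1005, '2020-12-03', 'f'),
    (7, 1003, '2020-12-03', 'g');
```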