Count unique rows GROUPed BY different columns than those used in DISTINCT ON

I'm sure this has been asked over and over, but I couldn't find a simple example that I could fully understand.

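I'm trying to deduplicate on one column (with DISTINCT ON) and then COUNT the records GROUPed BY columns other than the one used for deduplication, without introducing a subquery.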

Say I have a table with the following information:

order_num date region timestamp_updated
001 2021-09-01 Murica 2021-09-02T19:00:01Z
001 2021-09-01 Murica 2021-09-03T19:00:01Z
002 2021-09-01 Yurop 2021-09-02T19:00:01Z
003 2021-09-01 Yurop 2021-09-03T19:00:01Z
004 2021-09-02 Yurop 2021-09-03T19:00:01Z

I'd like to first get the unique records by order_num (keeping the most recently updated one) AND then count the orders grouped by date and region.

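I know how to do the two things separately (distinct on (order_num) + order by timestamp_updated desc to deduplicate, then select count(*) + group by date, region), or even together with a subquery. But I'd like to avoid subqueries if I can. This is where window functions (seem to) come in handy, and I don't know much about those.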
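For reference, the subquery version described above might look roughly like the sketch below. It assumes the table is called orders_table (the name used in the attempt that follows); the alias latest and the output column name counts are made up for illustration.

select date, region, count(*) as counts
from (
    -- keep only the most recently updated row for each order_num
    select distinct on (order_num) order_num, date, region
    from orders_table
    order by order_num, timestamp_updated desc
) as latest
-- then count the surviving orders per date and region
group by date, region;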

The closest I've been able to get is the groups, but they show one record per order_num. The records are right, but duplicated:

select distinct on (order_num) date, region, count(1)over (
    partition by order_num
)
from orders_table
order by order_num, timestamp_updated desc;

That query ^^ shows:

date region count
2021-09-01 Murica 1 I think this is the first 001
2021-09-01 Murica 1 I think this is the second 001
2021-09-01 Yurop 2 I think this is the first Yurop: 002
2021-09-01 Yurop 2 I think this is the second Yurop: 003
2021-09-02 Yurop 1

You can get the max timestamp_updated for each order_num, date, region, and then use a window function to aggregate again and get the count per date, region:

select distinct 
       date, 
       region, 
       count(max(timestamp_updated)) over (partition by date, region) as counts 
from t
group by order_num, date, region;
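
To try this against the sample data from the question, a setup along the following lines should work. The table name t follows the query above; the column types are assumptions, since the question does not state them. The last statement is an alternative sketch, not part of the query above: because that query counts one group per order_num within each date, region, a plain count(distinct order_num) should give the same counts as long as order_num and timestamp_updated are never null.

-- assumed column types; the question only shows the values
create table t (
    order_num         text,
    date              date,
    region            text,
    timestamp_updated timestamptz
);

insert into t (order_num, date, region, timestamp_updated) values
    ('001', '2021-09-01', 'Murica', '2021-09-02T19:00:01Z'),
    ('001', '2021-09-01', 'Murica', '2021-09-03T19:00:01Z'),
    ('002', '2021-09-01', 'Yurop',  '2021-09-02T19:00:01Z'),
    ('003', '2021-09-01', 'Yurop',  '2021-09-03T19:00:01Z'),
    ('004', '2021-09-02', 'Yurop',  '2021-09-03T19:00:01Z');

-- alternative without window functions or subqueries:
-- each order_num is counted once per date, region
select date,
       region,
       count(distinct order_num) as counts
from t
group by date, region
order by date, region;

With the five sample rows, both forms should return three rows: 2021-09-01 Murica 1, 2021-09-01 Yurop 2, and 2021-09-02 Yurop 1. Note that neither form looks at which row is the latest one: if an order's date or region changed between updates, that order would be counted under both its old and its new values, and the DISTINCT ON subquery would be the safer choice.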
