SQL: How can I count unique instances grouped by client ordered by date?

Question

I have the following table in a Snowflake data warehouse:

Client_ID	Appointment_Date	Store_ID
Client_1	1/1/2021	Store_1
Client_2	1/1/2021	Store_1
Client_1	2/1/2021	Store_2
Client_2	2/1/2021	Store_1
Client_1	3/1/2021	Store_1
Client_2	3/1/2021	Store_1

I need to count the number of unique Store_ID values for each Client_ID, accumulated in order of Appointment_Date. This is my desired output:

Customer_ID	Appointment_Date	Store_ID	Count_Different_Stores
Client_1	1/1/2021	Store_1	1
Client_2	1/1/2021	Store_1	1
Client_1	2/1/2021	Store_2	2
Client_2	2/1/2021	Store_1	1
Client_1	3/1/2021	Store_1	2
Client_2	3/1/2021	Store_1	1

In effect, I want a running count of the different stores a client has visited over time. I've tried:

SELECT Client_ID, Appointment_Date, Store_ID,
DENSE_RANK() OVER (PARTITION BY CLIENT_ID, STORE_ID ORDER BY APPOINTMENT_DATE)
FROM table

Which produces:

Customer_ID	Appointment_Date	Store_ID	Count_Different_Stores
Client_1	1/1/2021	Store_1	1
Client_2	1/1/2021	Store_1	1
Client_1	2/1/2021	Store_2	2
Client_2	2/1/2021	Store_1	2
Client_1	3/1/2021	Store_1	3
Client_2	3/1/2021	Store_1	3

And:

SELECT Client_ID, Store_ID,
DENSE_RANK() OVER (PARTITION BY CLIENT_ID, STORE_ID)
FROM table
--With a join back to the original table with all my needed data

Which produces:

Customer_ID	Appointment_Date	Store_ID	Count_Different_Stores
Client_1	1/1/2021	Store_1	2
Client_2	1/1/2021	Store_1	1
Client_1	2/1/2021	Store_2	1
Client_2	2/1/2021	Store_1	1
Client_1	3/1/2021	Store_1	1
Client_2	3/1/2021	Store_1	1

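The second is closer to what I need, but the ranking of the different stores does not necessarily follow the order of Appointment_Date, which is critical. Sometimes the order is correct, sometimes it is not.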

Any insight would be helpful, and I'm happy to provide more information.

Answer 1

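If I understand correctly, you want a cumulative count(distinct) as a window function. Snowflake does not support that directly, but you can easily calculate it using row_number() and a cumulative sum: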

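-- Inner query: number each client's visits to a given store by date,
-- so seqnum = 1 marks the client's first visit to that store.
-- Outer query: a running sum of those first visits per client gives the
-- cumulative count of distinct stores.
select t.*,
       sum((seqnum = 1)::int) over (partition by client_id order by appointment_date) as num_distinct_stores
from (select t.*,
             row_number() over (partition by client_id, store_id order by appointment_date) as seqnum
      from t
     ) t;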
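
To check the idea against the sample data in the question, here is a minimal sketch that runs the same row_number() plus cumulative-sum pattern locally. It rests on several assumptions that are not in the original post: SQLite (via Python's sqlite3 module, which needs SQLite 3.25 or later for window functions) stands in for Snowflake, the table is given the hypothetical name `appointments`, the dates are rewritten as ISO strings so that text ordering matches date ordering, and the Snowflake-style `(seqnum = 1)::int` cast is swapped for a portable `CASE` expression.

```python
import sqlite3

# Sample rows from the question. The ISO date strings are an assumption made
# here so that plain text ordering matches chronological ordering.
ROWS = [
    ("Client_1", "2021-01-01", "Store_1"),
    ("Client_2", "2021-01-01", "Store_1"),
    ("Client_1", "2021-02-01", "Store_2"),
    ("Client_2", "2021-02-01", "Store_1"),
    ("Client_1", "2021-03-01", "Store_1"),
    ("Client_2", "2021-03-01", "Store_1"),
]

# Same pattern as the answer: flag each client's first visit to a store
# (seqnum = 1), then take a running sum of those flags per client.
# CASE replaces the Snowflake-specific (seqnum = 1)::int cast.
QUERY = """
SELECT client_id, appointment_date, store_id,
       SUM(CASE WHEN seqnum = 1 THEN 1 ELSE 0 END)
           OVER (PARTITION BY client_id ORDER BY appointment_date)
           AS num_distinct_stores
FROM (SELECT a.*,
             ROW_NUMBER() OVER (PARTITION BY client_id, store_id
                                ORDER BY appointment_date) AS seqnum
      FROM appointments a) AS numbered
ORDER BY appointment_date, client_id
"""


def running_distinct_stores(rows):
    """Load rows into an in-memory SQLite table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE appointments "
        "(client_id TEXT, appointment_date TEXT, store_id TEXT)"
    )
    conn.executemany("INSERT INTO appointments VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in running_distinct_stores(ROWS):
        print(row)
```

On the sample data, the last column comes out as 1, 1, 2, 1, 2, 1 in date and client order, which matches the desired output in the question: Client_1 reaches 2 on the first visit to Store_2 and stays at 2 when returning to Store_1.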

sql

database

database-design

snowflake-cloud-data-platform