Oracle SQL: select the distinct customers in the past x-day rolling period
Suppose you have a table of customers with dates as follows:
[customer_table]
+----------+-----------+----------+
| customer | date      | purchase |
+----------+-----------+----------+
| 1        | 1/01/2016 | 12       |
+----------+-----------+----------+
| 1        | 1/12/2016 | 3        |
+----------+-----------+----------+
| 2        | 5/03/2016 | 5        |
+----------+-----------+----------+
| 3        | 1/16/2016 | 6        |
+----------+-----------+----------+
| 3        | 3/22/2016 | 1        |
+----------+-----------+----------+
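I want to write a query that counts how many distinct customers made a purchase in the past 10 days, as a rolling period that starts on each calendar day and looks back 10 days. So for every day in 2016, the final output would be a calendar in which each day has a count of the distinct customers who made a purchase in the 10 days up to that day, like this: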
[result_table]
+-----------+------------------+
| date      | unique customers |
+-----------+------------------+
| 1/01/2016 | 112              |
+-----------+------------------+
| 1/02/2016 | 104              |
+-----------+------------------+
| 1/03/2016 | 140              |
+-----------+------------------+
| 1/04/2016 | 133              |
+-----------+------------------+
| ....      | 121              |
+-----------+------------------+
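One solution I came up with is to create a calendar table with a single column, and then join the calendar table to the customer table using an inequality join. I believe this is extremely inefficient and am looking for a faster solution. So my first step was to create a calendar like this: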
[calendar]
+-----------+
| date      |
+-----------+
| 1/01/2016 |
+-----------+
| 1/02/2016 |
+-----------+
| 1/03/2016 |
+-----------+
| 1/04/2016 |
+-----------+
| 1/05/2016 |
+-----------+
Then, for each day in that calendar, to count the distinct set of customers in the days up to that day, I join with an inequality like so:
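-- date is a reserved word in Oracle; a real column would need another name
select
    c.date,
    count(distinct m.customer) as unique_customers
from calendar c
left join mytable m
    on c.date >= m.date and m.date >= c.date - 10
group by c.date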
While I believe this is correct, it runs extremely slowly (say, for a calendar of 2 years with a few million customers). Is there an oracle analytic function that may help me out here?
Not really. From the COUNT() documentation:
"If you specify DISTINCT, then you can specify only the query_partition_clause of the analytic_clause. The order_by_clause and windowing_clause are not allowed."
You would want DISTINCT together with a windowing_clause, and that combination is not allowed.
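Update:
You can get the same effect as the invalid syntax by combining a non-DISTINCT analytic query partitioned by customer with an aggregation by day:
Oracle Setup: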
CREATE TABLE table_name ( customer, dt ) AS
SELECT 1, DATE '2017-01-10' FROM DUAL UNION ALL
SELECT 1, DATE '2017-01-11' FROM DUAL UNION ALL
SELECT 1, DATE '2017-01-15' FROM DUAL UNION ALL
SELECT 1, DATE '2017-01-20' FROM DUAL UNION ALL
SELECT 2, DATE '2017-01-12' FROM DUAL UNION ALL
SELECT 2, DATE '2017-01-19' FROM DUAL UNION ALL
SELECT 3, DATE '2017-01-10' FROM DUAL UNION ALL
SELECT 3, DATE '2017-01-13' FROM DUAL UNION ALL
SELECT 3, DATE '2017-01-15' FROM DUAL UNION ALL
SELECT 3, DATE '2017-01-20' FROM DUAL;
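Query:
Note: the query below only covers one month of data and a range of the 2 preceding days to illustrate the principle, but it is easy to change the parameters to 12 months and 10 days.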
SELECT day,
       SUM( has_order_in_range ) AS unique_customers
FROM   (
  SELECT customer,
         day,
         LEAST(
           1,
           COUNT(dt) OVER ( PARTITION BY customer
                            ORDER BY day
                            RANGE BETWEEN INTERVAL '2' DAY PRECEDING
                                      AND INTERVAL '0' DAY FOLLOWING )
         ) AS has_order_in_range
  FROM   table_name t
         PARTITION BY ( customer )
         RIGHT OUTER JOIN
         ( -- Create a calendar for one month
           SELECT DATE '2017-01-01' + LEVEL - 1 AS day
           FROM   DUAL
           CONNECT BY DATE '2017-01-01' + LEVEL - 1 < ADD_MONTHS( DATE '2017-01-01', 1 )
         ) d
         ON ( t.dt = d.day )
)
GROUP BY day
ORDER BY day;
Output:
DAY UNIQUE_CUSTOMERS
------------------- ----------------
2017-01-01 00:00:00 0
2017-01-02 00:00:00 0
2017-01-03 00:00:00 0
2017-01-04 00:00:00 0
2017-01-05 00:00:00 0
2017-01-06 00:00:00 0
2017-01-07 00:00:00 0
2017-01-08 00:00:00 0
2017-01-09 00:00:00 0
2017-01-10 00:00:00 2
2017-01-11 00:00:00 2
2017-01-12 00:00:00 3
2017-01-13 00:00:00 3
2017-01-14 00:00:00 2
2017-01-15 00:00:00 2
2017-01-16 00:00:00 2
2017-01-17 00:00:00 2
2017-01-18 00:00:00 0
2017-01-19 00:00:00 1
2017-01-20 00:00:00 3
2017-01-21 00:00:00 3
2017-01-22 00:00:00 2
2017-01-23 00:00:00 0
2017-01-24 00:00:00 0
2017-01-25 00:00:00 0
2017-01-26 00:00:00 0
2017-01-27 00:00:00 0
2017-01-28 00:00:00 0
2017-01-29 00:00:00 0
2017-01-30 00:00:00 0
2017-01-31 00:00:00 0
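The counts above can be cross-checked outside the database. The following is only a brute-force sketch in Python, not part of the Oracle solution: it copies the sample rows from the setup and, for each day in January 2017, counts the distinct customers with an order on that day or in the 2 preceding days. The function name `rolling_unique_customers` and its `days_back` parameter are made up for this illustration.

```python
from datetime import date, timedelta

# Sample rows copied from the Oracle setup: (customer, dt).
orders = [
    (1, date(2017, 1, 10)), (1, date(2017, 1, 11)),
    (1, date(2017, 1, 15)), (1, date(2017, 1, 20)),
    (2, date(2017, 1, 12)), (2, date(2017, 1, 19)),
    (3, date(2017, 1, 10)), (3, date(2017, 1, 13)),
    (3, date(2017, 1, 15)), (3, date(2017, 1, 20)),
]


def rolling_unique_customers(orders, first_day, last_day, days_back):
    """For each day, count distinct customers with an order in [day - days_back, day]."""
    result = {}
    day = first_day
    while day <= last_day:
        window_start = day - timedelta(days=days_back)
        # A set keeps each customer once, however many orders fall in the window.
        result[day] = len({c for c, dt in orders if window_start <= dt <= day})
        day += timedelta(days=1)
    return result


# Same parameters as the query: January 2017, current day plus 2 preceding days.
counts = rolling_unique_customers(
    orders, date(2017, 1, 1), date(2017, 1, 31), days_back=2
)
for day, n in counts.items():
    print(day.isoformat(), n)
```

This scans every order for every day, so it is only suitable for checking small samples; it should agree with the UNIQUE_CUSTOMERS column for the sample data.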