SQL: Difference between consecutive rows
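The table has 3 columns: order_id, member_id, order_date
I need to pull the distribution of orders broken down by the number of days b/w 2 consecutive orders for each member_id
What I have is this: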
SELECT
a1.member_id,
count(distinct a1.order_id) as num_orders,
a1.order_date,
DATEDIFF(DAY, a1.order_date, a2.order_date) as days_since_last_order
from orders as a1
inner join orders as a2
on a2.member_id = a1.member_id+1;
It doesn't quite get me there, because the output I need is:
You can use lag() to get the date of the previous order of the same customer:
select o.*,
datediff(
order_date,
lag(order_date) over(partition by member_id order by order_date, order_id)
) days_diff
from orders o
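When there are two rows on the same date, the smallest order_id is considered first. Also note that I fixed your datediff() syntax: in Hive, the function takes just two dates, no unit.
I just don't understand the logic you want for computing num_orders.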
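If num_orders is meant to be the number of orders that fall into each day gap, which is only a guess at the intent, the lag() query can be wrapped and grouped. A minimal sketch, assuming Hive and the same orders table; the alias t is just a placeholder name:

```sql
-- Distribution of orders by the number of days since the same member's previous order.
select
    t.days_diff,
    count(*) as num_orders
from (
    select
        datediff(
            order_date,
            lag(order_date) over(partition by member_id order by order_date, order_id)
        ) as days_diff
    from orders
) t
-- each member's first order has no previous order, so its days_diff is null
where t.days_diff is not null
group by t.days_diff
order by t.days_diff
```

Each member's first order is left out here because there is nothing to compare it to; drop the where clause if those rows should show up as a separate null bucket.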
Maybe something like this:
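SELECT
a1.member_id,
a1.order_id,
a1.order_date,
-- three-argument form (unit, start, end); in Hive use DATEDIFF(a1.order_date, a2.order_date) instead
DATEDIFF(DAY, a2.order_date, a1.order_date) as days_since_last_order
from orders as a1
inner join orders as a2
on a2.member_id = a1.member_id
-- a2 must be an earlier order of the same member
where a2.order_date < a1.order_date
and not exists (
-- no order of this member falls strictly between a2 and a1
select 1
from orders as intermediate_order
where intermediate_order.member_id = a1.member_id
and intermediate_order.order_date < a1.order_date
and intermediate_order.order_date > a2.order_date) ;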