How to add values of related rows to a column in SQL
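I have a table with contract information, and I want to add a calculated column that identifies when a contract is consecutive with the previous contract for the same client. That is, when a contract's end date matches the start date of the same client's next contract, we consider the two consecutive.
The data looks like this:
I would like it to look like this:
I tried inner-joining the contract table to itself and then combining the results, but I don't think that is the most efficient approach.
Do you know a better way to achieve this?
Thanks in advance.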
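The following code was tested on SQL Server, but it should also run on Presto. SQL Server has bit values rather than booleans, so I returned TRUE and FALSE as strings. Hopefully you can make any modifications you need.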
    SELECT
        c.contract_id
        ,c.client
        ,c.start_date
        ,c.end_date
        -- 'TRUE' when this contract starts on the day the client's previous contract ended
        ,CASE
            WHEN LAG(c.end_date) OVER (PARTITION BY c.client ORDER BY c.start_date) = c.start_date
            THEN 'TRUE' ELSE 'FALSE'
        END AS is_consecutive
        -- id of the previous contract, only when it is consecutive with this one
        ,CASE
            WHEN LAG(c.end_date) OVER (PARTITION BY c.client ORDER BY c.start_date) = c.start_date
            THEN LAG(c.contract_id) OVER (PARTITION BY c.client ORDER BY c.start_date)
        END AS related_previous
        -- id of the next contract, only when it starts on the day this one ends
        ,CASE
            WHEN LEAD(c.start_date) OVER (PARTITION BY c.client ORDER BY c.start_date) = c.end_date
            THEN LEAD(c.contract_id) OVER (PARTITION BY c.client ORDER BY c.start_date)
        END AS related_next
    FROM
        contract c
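If you want to try the query above without a SQL Server or Presto instance, here is a minimal sketch that runs the same LAG/LEAD logic against an in-memory SQLite database from Python. The sample rows, the function name find_consecutive, and the storage of dates as ISO text are my own assumptions for illustration, not data from the question. It also uses a named WINDOW clause to avoid repeating the OVER definition, which SQLite (3.25+) accepts but older SQL Server versions do not.

```python
import sqlite3

# Hypothetical sample data: (contract_id, client, start_date, end_date).
# Dates are ISO-8601 text, so equality comparison behaves as expected.
ROWS = [
    (1, "A", "2020-01-01", "2020-06-30"),
    (2, "A", "2020-06-30", "2020-12-31"),  # starts the day contract 1 ends
    (3, "A", "2021-03-01", "2021-09-30"),  # gap before this contract
    (4, "B", "2020-02-01", "2020-08-01"),  # only contract for client B
]

QUERY = """
SELECT
    c.contract_id,
    CASE
        WHEN LAG(c.end_date) OVER w = c.start_date THEN 'TRUE' ELSE 'FALSE'
    END AS is_consecutive,
    CASE
        WHEN LAG(c.end_date) OVER w = c.start_date THEN LAG(c.contract_id) OVER w
    END AS related_previous,
    CASE
        WHEN LEAD(c.start_date) OVER w = c.end_date THEN LEAD(c.contract_id) OVER w
    END AS related_next
FROM contract c
WINDOW w AS (PARTITION BY c.client ORDER BY c.start_date)
ORDER BY c.contract_id
"""


def find_consecutive(rows):
    """Load rows into a throwaway table and return the query result as tuples."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE contract ("
            "contract_id INTEGER, client TEXT, start_date TEXT, end_date TEXT)"
        )
        conn.executemany("INSERT INTO contract VALUES (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in find_consecutive(ROWS):
        print(row)
```

Note that the first contract of each client gets 'FALSE' because LAG returns NULL there, and a comparison with NULL falls through to the ELSE branch.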
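If more than two contracts are chained together, your "related id" columns become a challenge. This turns into a gaps-and-islands problem, and the best approach uses lag() and a cumulative sum: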
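    select t.*,
           -- true when this contract starts on the day the previous one ended
           coalesce(prev_end_date = start_date, false) as is_consecutive,
           -- every contract in the same unbroken chain (island) for this client
           array_agg(contract_id) over (partition by client, grp
                                        order by start_date
                                        rows between unbounded preceding and unbounded following) as contracts
    from (select t.*,
                 -- the running sum increases by 1 each time the chain is broken
                 sum(case when prev_end_date = start_date then 0 else 1 end) over (partition by client order by start_date) as grp
          from (select t.*,
                       lag(end_date) over (partition by client order by start_date) as prev_end_date
                from t
               ) t
         ) t;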
Note: this puts all related contracts into an array. You can remove the current contract from it if you prefer.
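To see the gaps-and-islands idea on a chain of three contracts, here is a rough sketch using SQLite from Python. Everything specific in it is an assumption on my part: the sample rows, the function name group_consecutive, and the use of GROUP_CONCAT in place of array_agg (SQLite has no array type, so the ids come back as a comma-separated string that the Python code splits). A contract is treated as continuing the chain when the previous end date equals its start date, as defined in the question.

```python
import sqlite3

# Hypothetical sample data: (contract_id, client, start_date, end_date).
ROWS = [
    (1, "A", "2020-01-01", "2020-03-31"),
    (2, "A", "2020-03-31", "2020-06-30"),  # continues contract 1
    (3, "A", "2020-06-30", "2020-09-30"),  # continues contract 2
    (4, "A", "2021-01-01", "2021-06-30"),  # gap: starts a new island
    (5, "B", "2020-02-01", "2020-08-01"),
]

QUERY = """
SELECT contract_id,
       COALESCE(prev_end_date = start_date, 0) AS is_consecutive,
       GROUP_CONCAT(contract_id) OVER (
           PARTITION BY client, grp
           ORDER BY start_date
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS contracts
FROM (SELECT t.*,
             -- running count of chain breaks = island number per client
             SUM(CASE WHEN prev_end_date = start_date THEN 0 ELSE 1 END)
                 OVER (PARTITION BY client ORDER BY start_date) AS grp
      FROM (SELECT c.*,
                   LAG(end_date) OVER (PARTITION BY client
                                       ORDER BY start_date) AS prev_end_date
            FROM contract c
           ) t
     ) t
ORDER BY contract_id
"""


def group_consecutive(rows):
    """Return (contract_id, is_consecutive, ids_in_same_island) per contract."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE contract ("
            "contract_id INTEGER, client TEXT, start_date TEXT, end_date TEXT)"
        )
        conn.executemany("INSERT INTO contract VALUES (?, ?, ?, ?)", rows)
        result = []
        for contract_id, is_consecutive, contracts in conn.execute(QUERY):
            # GROUP_CONCAT yields text such as "1,2,3"; convert back to ints.
            ids = [int(part) for part in contracts.split(",")]
            result.append((contract_id, bool(is_consecutive), ids))
        return result
    finally:
        conn.close()


if __name__ == "__main__":
    for row in group_consecutive(ROWS):
        print(row)
```

The explicit frame (UNBOUNDED PRECEDING to UNBOUNDED FOLLOWING) matters here: with only an ORDER BY in the window, the default frame stops at the current row, so each row would see just the contracts up to itself rather than the whole island.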