Issue with repeated records in SQL
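My dataset looks like this:
Whenever the team changes, I am trying to get the employee's minimum start date and maximum end date.
The problem is that I am not getting the dates for a team that repeats.
Any help would be appreciated.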
If I understand correctly, this is a gaps-and-islands problem that can be solved with a difference of row numbers.
You can use:
select emp_id, team, min(month_end_date), max(month_end_date)
from (select t.*,
row_number() over (partition by emp_id order by month_end_date) as seqnum,
row_number() over (partition by emp_id, team order by month_end_date) as seqnum_2
from t
) t
group by emp_id, team, (seqnum - seqnum_2);
Note: this puts the dates in a single row, which seems more useful than your expected result.
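To see why the difference of row numbers separates repeated teams, here is a minimal sketch that runs the same query against a few made-up rows. It uses Python's built-in sqlite3 module only as a convenient way to execute the SQL (this assumes the bundled SQLite is 3.25 or later, which is needed for window functions). The sample data, the function name find_islands, and the column aliases start_date and end_date are assumptions for illustration and are not from the original question.

```python
import sqlite3

# Hypothetical sample data: one employee who moves from team A to B and back to A.
rows = [
    (1, "A", "2020-01-31"),
    (1, "A", "2020-02-29"),
    (1, "B", "2020-03-31"),
    (1, "B", "2020-04-30"),
    (1, "A", "2020-05-31"),
]

# Same query as in the answer, with aliases and an ORDER BY added for readability.
QUERY = """
select emp_id, team,
       min(month_end_date) as start_date,
       max(month_end_date) as end_date
from (select t.*,
             row_number() over (partition by emp_id order by month_end_date) as seqnum,
             row_number() over (partition by emp_id, team order by month_end_date) as seqnum_2
      from t
     ) t
group by emp_id, team, (seqnum - seqnum_2)
order by emp_id, start_date
"""


def find_islands(data):
    """Load the rows into an in-memory table and return one row per island."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (emp_id integer, team text, month_end_date text)")
    conn.executemany("insert into t values (?, ?, ?)", data)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


islands = find_islands(rows)
for island in islands:
    print(island)
```

For these rows, seqnum runs 1 to 5 while seqnum_2 restarts per team, so the differences are 0, 0 for the first A stretch, 2, 2 for B, and 2 for the second A stretch. Because team is also part of the group by, the second A stretch (difference 2) stays separate from the first (difference 0), which is what gives the repeated team its own row.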
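Teradata has a nice SQL extension for normalizing overlapping date ranges. This assumes you want additional rows when months are missing, i.e. when there are gaps: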
SELECT
emp_id
,team
-- split the Period into separate columns again
,Begin(pd)
,last_day(add_months(End(pd),-1)) -- end of previous month
FROM
(
SELECT NORMALIZE -- normalize overlapping periods
emp_id
,team
-- NORMALIZE only works with periods, so create a Period based on current date plus one month
,PERIOD(month_end_date
,last_day(add_months(month_end_date, 1))
) AS pd
FROM vt
) AS dt;