Find Customers With 4 Consecutive Years of Giving (Including Gaps)
I have a table similar to the following:
+------------+-----------+
| CustomerID | OrderYear |
+------------+-----------+
|          1 |      2012 |
|          1 |      2013 |
|          1 |      2014 |
|          1 |      2017 |
|          1 |      2018 |
|          2 |      2012 |
|          2 |      2013 |
|          2 |      2014 |
|          2 |      2015 |
|          2 |      2017 |
+------------+-----------+
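How can I determine which CustomerIDs gave in 4 consecutive years? (Above, only customer 2 did.) As you can see, some customers have gaps in their sequence of years.
I started by trying some combination of ROW_NUMBER/LAG/LEAD, but have had no luck so far.
Very pared down/modified attempt...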
WITH CTE
AS
(
SELECT T.ConstituentLookupID,
       T.FISCALYEAR,
       COUNT(T.FISCALYEAR) OVER (PARTITION BY T.ConstituentLookupID) AS YearCount,
       FIRST_VALUE(T.FISCALYEAR) OVER (PARTITION BY T.ConstituentLookupID ORDER BY T.FISCALYEAR DESC) - T.FISCALYEAR + 1 AS X,
       ROW_NUMBER() OVER (PARTITION BY T.ConstituentLookupID ORDER BY T.FISCALYEAR DESC) AS RN
FROM #Temp AS T)
SELECT CTE.ConstituentLookupID,
       CTE.FISCALYEAR,
       CTE.YearCount,
       CTE.X,
       CTE.RN
FROM CTE
WHERE CTE.YearCount >= 4 --Have at least 4 years of giving
AND CTE.X - CTE.RN = 1; --Some kind of way to calculate consecutive years. Doesn't account for current year and gaps...
Assuming there are no duplicates, you can use lag():
select distinct customerid
from (
select t.*,
lag(orderyear, 3) over(partition by customerid order by orderyear) orderyear3
from mytable t
) t
where orderyear = orderyear3 + 3
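To see why this comparison works, here is a minimal sketch that shows the lagged value next to each row. It inlines the sample rows from the question so it runs on its own; the VALUES list and the prev3_year alias are additions for illustration, and the syntax assumes SQL Server 2012+ or PostgreSQL.

```sql
-- Sample rows from the question, inlined so the query is self-contained.
with mytable (customerid, orderyear) as (
    select customerid, orderyear
    from (values
        (1, 2012), (1, 2013), (1, 2014), (1, 2017), (1, 2018),
        (2, 2012), (2, 2013), (2, 2014), (2, 2015), (2, 2017)
    ) v (customerid, orderyear)
)
select customerid,
       orderyear,
       -- Year found three rows earlier for the same customer (null for the first three rows).
       lag(orderyear, 3) over (partition by customerid order by orderyear) as prev3_year
from mytable
order by customerid, orderyear;
```

Only customer 2's 2015 row looks back to a year exactly 3 earlier (2012). Customer 1's 2017 and 2018 rows look back to 2012 and 2013, which are 5 years earlier, and customer 2's 2017 row looks back to 2013, so none of those qualify.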
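A more conventional approach is to use a gaps-and-islands technique. This is handy if you want the start and end of each series. Here, an island is a series of rows with "adjacent" order years, and you want islands that are at least 4 years long. We can identify the islands by comparing the order year against an incrementing sequence, then use aggregation: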
select customerid, min(orderyear) firstorderyear, max(orderyear) lastorderyear
from (
select t.*,
row_number() over(partition by customerid order by orderyear) rn
from mytable t
) t
group by customerid, orderyear - rn
having count(*) >= 4
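As a rough illustration of how the grouping key behaves, the sketch below lists orderyear - rn for every sample row before any aggregation. The inlined VALUES list and the island_key alias are assumptions added for the example, and the syntax assumes SQL Server or PostgreSQL.

```sql
-- Sample rows from the question, inlined so the query is self-contained.
with mytable (customerid, orderyear) as (
    select customerid, orderyear
    from (values
        (1, 2012), (1, 2013), (1, 2014), (1, 2017), (1, 2018),
        (2, 2012), (2, 2013), (2, 2014), (2, 2015), (2, 2017)
    ) v (customerid, orderyear)
)
select customerid,
       orderyear,
       row_number() over (partition by customerid order by orderyear) as rn,
       -- Constant within a run of consecutive years; it changes after each gap.
       orderyear - row_number() over (partition by customerid order by orderyear) as island_key
from mytable
order by customerid, orderyear;
```

Customer 1 gets island_key 2011 for 2012 to 2014 and 2013 for 2017 to 2018, so its longest island has 3 rows. Customer 2 gets island_key 2011 for 2012 to 2015 (4 rows) and 2012 for 2017, so only customer 2 passes the having count(*) >= 4 filter.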
I have a simple solution using row number and group by:
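SELECT z.customerid,
       Count(*) AS years_in_run
FROM (SELECT customerid,
             orderyear,
             -- Number each customer's years separately, in year order, so that
             -- consecutive years share the same orderyear - row number value.
             orderyear - Row_number()
                           OVER (
                             PARTITION BY customerid
                             ORDER BY orderyear) AS Grp
      FROM mytable) z
GROUP BY z.customerid,
         z.grp
HAVING Count(*) >= 4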
Assuming there is no more than one row per customer and year, the simplest approach is lag():
select customerid, orderyear
from (select t.*,
lag(orderyear, 3) over (partition by customerid order by orderyear) as prev3_year
from t
) t
where prev3_year = orderyear - 3;
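The idea is to look back three rows. If the year on that row is exactly 3 years earlier, then the customer gave in four consecutive years. If your data can have duplicates, the logic can be adjusted (they make the query slightly more complicated).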
This may return duplicates, so you may just want:
select distinct customerid
from (select t.*,
lag(orderyear, 3) over (partition by customerid order by orderyear) as prev3_year
from t
) t
where prev3_year = orderyear - 3;
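For the duplicate case mentioned above, one possible adjustment (a sketch, not the only way to do it) is to collapse the data to one row per customer and year before applying lag(). The sample rows below are hypothetical: they are the question's data with an extra 2013 row for customer 2, which would otherwise hide the 2012 to 2015 run. The years CTE name is an assumption, and the syntax assumes SQL Server 2012+ or PostgreSQL.

```sql
-- Hypothetical sample data: customer 2 has a duplicate row for 2013.
with t (customerid, orderyear) as (
    select customerid, orderyear
    from (values
        (1, 2012), (1, 2013), (1, 2014), (1, 2017), (1, 2018),
        (2, 2012), (2, 2013), (2, 2013), (2, 2014), (2, 2015), (2, 2017)
    ) v (customerid, orderyear)
),
-- One row per customer and year, so "three rows back" again means "three distinct years back".
years as (
    select distinct customerid, orderyear
    from t
)
select distinct customerid
from (select y.*,
             lag(orderyear, 3) over (partition by customerid order by orderyear) as prev3_year
      from years y
     ) y
where prev3_year = orderyear - 3;
```

With the duplicate removed first, the query returns customer 2 only. Run directly against the duplicated rows, the lag() comparison would miss customer 2, because the extra 2013 row shifts which row sits three positions back.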