How to get adjacent value in an OVER() window

I have the following data and a query that gets the MAX(wins) up through the current season:

WITH results as (
    SELECT 'DAL' as team, 2010 as season, 6 as wins union
    SELECT 'DET' as team, 2010 as season, 6 as wins union
    SELECT 'DET' as team, 2011 as season, 10 as wins union
    SELECT 'DET' as team, 2012 as season, 4 as wins union
    SELECT 'DET' as team, 2013 as season, 7 as wins union
    SELECT 'DET' as team, 2014 as season, 11 as wins
) SELECT team, season, wins
    ,MAX(wins) OVER (PARTITION BY team ORDER BY season ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) max_wins_thus_far
FROM results;

# team, season, wins, max_wins_thus_far
DAL, 2010, 6, 6
DET, 2010, 6, 6
DET, 2011, 10, 10
DET, 2012, 4, 10
DET, 2013, 7, 10
DET, 2014, 11, 11

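Here we can see, for example, that for DET the max wins in 2011 is 10, so the max_wins_thus_far column stays at 10 from 2011 until 2014, when it takes the greater value of 11. However, I would also like to pull the season in which the max wins up to that point occurred. For example, here is how the results would look: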

# team, season, wins, max_wins_thus_far, season_with_max_wins_thus_far
DAL, 2010, 6, 6, 2010
DET, 2010, 6, 6, 2010
DET, 2011, 10, 10, 2011 <-- 2011 has the most wins for DET
DET, 2012, 4, 10, 2011
DET, 2013, 7, 10, 2011
DET, 2014, 11, 11, 2014 <-- now 2014 is the season with the most wins...

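How can this be done with an analytic function? The best I have been able to do is build an object with the data, but I am not sure where to go from there: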

# team, season, wins, max_wins_thus_far
DAL, 2010, 6, {"2010": 6}
DET, 2010, 6, {"2010": 6}
DET, 2011, 10, {"2010": 6, "2011": 10}
DET, 2012, 4, {"2010": 6, "2011": 10, "2012": 4}
DET, 2013, 7, {"2010": 6, "2011": 10, "2012": 4, "2013": 7}
DET, 2014, 11, {"2010": 6, "2011": 10, "2012": 4, "2013": 7, "2014": 11}

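We can use a gaps-and-islands technique: the idea is to build groups of "adjacent" records with a window sum that increments every time a wins value is greater than all preceding values. We can then use a window min() to recover the corresponding season (basically, the start of each island).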

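select team, season, wins, 
    -- max_wins_1 is null on each team's first row, so fall back to wins
    greatest(wins, coalesce(max_wins_1, wins)) max_wins_thus_far,
    -- the first season of each island is the season that set the record
    min(season) over(partition by team, grp order by season) as season_with_max_wins_thus_far
from (
    select r.*,
        -- grp increments each time a season beats every earlier season
        sum(case when wins > max_wins_1 then 1 else 0 end) 
            over(partition by team order by season) as grp
    from (
        select r.*,
            -- best win total over the strictly earlier seasons
            max(wins) over (
                partition by team 
                order by season 
                rows between unbounded preceding and 1 preceding
            ) as max_wins_1
        from results r
    ) r
) r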
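
To make the islands easier to see, here is a minimal sketch that runs only the two inner levels of the query above and exposes the helper columns. It assumes the sample data from the question and a database with window-function support (for example MySQL 8+ or PostgreSQL); the final ORDER BY is added only for readability.

```sql
-- Sketch: show max_wins_1 and grp for each row of the sample data.
WITH results AS (
    SELECT 'DAL' AS team, 2010 AS season, 6 AS wins UNION ALL
    SELECT 'DET', 2010, 6  UNION ALL
    SELECT 'DET', 2011, 10 UNION ALL
    SELECT 'DET', 2012, 4  UNION ALL
    SELECT 'DET', 2013, 7  UNION ALL
    SELECT 'DET', 2014, 11
)
SELECT r.*,
    -- running count of "new record" seasons: this is the island id
    SUM(CASE WHEN wins > max_wins_1 THEN 1 ELSE 0 END)
        OVER (PARTITION BY team ORDER BY season) AS grp
FROM (
    SELECT r.*,
        -- best win total among strictly earlier seasons (NULL on the first row)
        MAX(wins) OVER (
            PARTITION BY team
            ORDER BY season
            ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
        ) AS max_wins_1
    FROM results r
) r
ORDER BY team, season;
```

On the sample data, grp is 0 for both 2010 rows, 1 for DET 2011 through 2013, and 2 for DET 2014. The outer min(season) over each (team, grp) island therefore returns 2010, 2011, and 2014 respectively.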

Another approach is a correlated subquery:

select team, season, wins, 
    max(wins) over(partition by team order by season) as max_wins_thus_far,
    (
        select r1.season
        from results r1 
        where r1.team = r.team and r1.season <= r.season
        order by r1.wins desc, r1.season
        limit 1
    ) as season_with_max_wins_thus_far
from results r

Demo on DB Fiddle - both queries yield:

team | season | wins | max_wins_thus_far | season_with_max_wins_thus_far
:--- | -----: | ---: | ----------------: | ----------------------------:
DAL  |   2010 |    6 |                 6 |                          2010
DET  |   2010 |    6 |                 6 |                          2010
DET  |   2011 |   10 |                10 |                          2011
DET  |   2012 |    4 |                10 |                          2011
DET  |   2013 |    7 |                10 |                          2011
DET  |   2014 |   11 |                11 |                          2014

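This is certainly the hackiest approach, but given that both season and wins are numbers, we can add them together and take the max of the sum (something like 2024 from adding 2010 and 14), then retrieve the season by subtracting the max wins up to that point. Here is an example: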

WITH results as (
    SELECT 'DAL' as team, 2010 as season, 6 as wins union
    SELECT 'DET' as team, 2010 as season, 6 as wins union
    SELECT 'DET' as team, 2011 as season, 10 as wins union
    SELECT 'DET' as team, 2012 as season, 4 as wins union
    SELECT 'DET' as team, 2013 as season, 7 as wins union
    SELECT 'DET' as team, 2014 as season, 11 as wins
) 
SELECT team, season, wins,
    max(wins) OVER through_current AS max_wins_thus_far
   ,max(wins + season) OVER through_current - max(wins) OVER through_current AS season_with_max_wins_thus_far
FROM results
WINDOW through_current AS (PARTITION BY team ORDER BY season ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)

# team, season, wins, max_wins_thus_far, season_with_max_wins_thus_far
DAL, 2010, 6, 6, 2010
DET, 2010, 6, 6, 2010
DET, 2011, 10, 10, 2011
DET, 2012, 4, 10, 2011
DET, 2013, 7, 10, 2011
DET, 2014, 11, 11, 2014
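
The plain sum works on the sample data, but it can mix the two numbers: with 10 wins in 2011 followed by 9 wins in 2013, the largest sum is 2022, and 2022 - 10 gives 2012, a season that is not the record season. Below is a sketch of a variant that keeps wins and season in separate digit ranges. It assumes season is a four-digit year (below 10000) and wins is non-negative; the 'XYZ' rows are hypothetical and are included only to show the case above.

```sql
WITH results AS (
    SELECT 'DET' AS team, 2010 AS season, 6 AS wins UNION ALL
    SELECT 'DET', 2011, 10 UNION ALL
    SELECT 'DET', 2012, 4  UNION ALL
    SELECT 'DET', 2013, 7  UNION ALL
    SELECT 'DET', 2014, 11 UNION ALL
    -- hypothetical rows, not part of the original data
    SELECT 'XYZ', 2011, 10 UNION ALL
    SELECT 'XYZ', 2013, 9
)
SELECT team, season, wins,
    MAX(wins) OVER through_current AS max_wins_thus_far,
    -- Pack wins into the high digits and (9999 - season) into the low four,
    -- so the max is decided by wins first and, on ties, by the earliest season.
    9999 - MOD(MAX(wins * 10000 + (9999 - season)) OVER through_current, 10000)
        AS season_with_max_wins_thus_far
FROM results
WINDOW through_current AS (
    PARTITION BY team ORDER BY season
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
)
ORDER BY team, season;
```

For the hypothetical XYZ rows this returns 2011 for both seasons, and for DET it returns the same seasons as the listing above (2010, 2011, 2011, 2011, 2014).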

Another approach is a correlated subquery that filters on season <= the current season and team = the current team. For example:

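-- uses the same "results" CTE as in the question
SELECT *,
    (SELECT season FROM results AS r_inner
     WHERE r_inner.season <= results.season AND r_inner.team = results.team
     -- break ties on wins by taking the earliest season
     ORDER BY wins DESC, season LIMIT 1) best_season
 FROM results;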

You can use a second level of window functions. Just grab the most recent season where wins equals the max wins so far:

SELECT r.*,
       MAX(CASE WHEN wins = max_wins_thus_far THEN season END) OVER (PARTITION BY team ORDER BY season) as max_season
FROM (SELECT team, season, wins,
             MAX(wins) OVER (PARTITION BY team ORDER BY season) as max_wins_thus_far
      FROM results
     ) r;

Here is a db<>fiddle.