Add column with rolling calculation group by

I have a table like this:

current_date user_id mode_name mode_time
2021-10-01 1 game 10
2021-10-02 1 game 10
2021-10-02 1 tv 30
2021-10-09 1 music 10
2021-10-15 1 music 40
2021-10-01 2 music 10
2021-10-01 2 game 10
2021-10-04 2 game 10
2021-10-04 2 music 20
2021-10-05 2 tv 40
2021-10-11 2 tv 40
2021-10-12 2 game 20

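I want to add two columns:

  1. A column with the favourite mode_name for each user_id, based on the cumulative sum of the mode_time column
  2. A column with the cumulative sum of mode_time for that favourite mode_name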

The desired table should look like this:

current_date user_id mode_name mode_time favourite_mode favourite_mode_time
2021-10-01 1 game 10 game 10
2021-10-02 1 game 10 tv 30
2021-10-02 1 tv 30 tv 30
2021-10-09 1 music 10 tv 30
2021-10-15 1 music 40 music 50
2021-10-01 2 music 10 game 10
2021-10-01 2 game 10 game 10
2021-10-04 2 game 10 music 30
2021-10-04 2 music 20 music 30
2021-10-05 2 tv 40 tv 40
2021-10-11 2 tv 40 tv 80
2021-10-12 2 game 20 tv 80

The table can be found here: https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=e05302a2cfd81a2a55de811e294f513e

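You can use sum partitioned by user and mode to calculate the rolling sum for each mode, then use max and max_by in the outer select to get the corresponding values: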

-- sample data
WITH dataset (date, user_id, mode_name, mode_time) AS (
    values ('2021-10-01', 1, 'game', 10),
        ('2021-10-02', 1, 'game', 10),
        ('2021-10-02', 1, 'tv', 30),
        ('2021-10-09', 1, 'music', 10),
        ('2021-10-15', 1, 'music', 40),
        ('2021-10-01', 2, 'game', 10),
        ('2021-10-01', 2, 'music', 10),
        ('2021-10-04', 2, 'game', 10),
        ('2021-10-04', 2, 'music', 20),
        ('2021-10-05', 2, 'tv', 40),
        ('2021-10-11', 2, 'tv', 40),
        ('2021-10-12', 2, 'game', 20)
) 

-- query
SELECT date, user_id, mode_name, mode_time,
    -- mode with the largest rolling total seen so far for this user
    max_by(mode_name, mode_time_rolling_time) OVER (
        PARTITION BY user_id
        ORDER BY date
    ) AS favourite_mode,
    -- the largest rolling total itself
    max(mode_time_rolling_time) OVER (
        PARTITION BY user_id
        ORDER BY date
    ) AS favourite_mode_time
FROM (
        -- rolling sum of mode_time per user and mode
        SELECT *,
            sum(mode_time) OVER (
                PARTITION BY user_id,
                mode_name
                ORDER BY date
            ) AS mode_time_rolling_time
        FROM dataset
    )
ORDER BY user_id, date

Output:

date user_id mode_name mode_time favourite_mode favourite_mode_time
2021-10-01 1 game 10 game 10
2021-10-02 1 game 10 tv 30
2021-10-02 1 tv 30 tv 30
2021-10-09 1 music 10 tv 30
2021-10-15 1 music 40 music 50
2021-10-01 2 game 10 game 10
2021-10-01 2 music 10 game 10
2021-10-04 2 music 20 music 30
2021-10-04 2 game 10 music 30
2021-10-05 2 tv 40 tv 40
2021-10-11 2 tv 40 tv 80
2021-10-12 2 game 20 tv 80
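
To check the logic without a SQL engine that provides max_by (Presto/Trino has it, for example), the same computation can be sketched in plain Python. This is only an illustration under a few assumptions: the function name add_favourite_columns is hypothetical, mode_time is assumed to be non-negative, and on ties the mode that reached the value first wins, whereas the SQL version may break ties differently. Rows that share a date are treated as peers, so all of them update the running totals before any of them is emitted, which mirrors how a window with ORDER BY date and the default frame behaves.

```python
from collections import defaultdict
from itertools import groupby

# Same sample data as the SQL above: (date, user_id, mode_name, mode_time)
rows = [
    ("2021-10-01", 1, "game", 10),
    ("2021-10-02", 1, "game", 10),
    ("2021-10-02", 1, "tv", 30),
    ("2021-10-09", 1, "music", 10),
    ("2021-10-15", 1, "music", 40),
    ("2021-10-01", 2, "game", 10),
    ("2021-10-01", 2, "music", 10),
    ("2021-10-04", 2, "game", 10),
    ("2021-10-04", 2, "music", 20),
    ("2021-10-05", 2, "tv", 40),
    ("2021-10-11", 2, "tv", 40),
    ("2021-10-12", 2, "game", 20),
]


def add_favourite_columns(rows):
    """Hypothetical helper: append (favourite_mode, favourite_mode_time) to each row."""
    # Stable sort by user_id, then date (ISO date strings sort chronologically).
    ordered = sorted(rows, key=lambda r: (r[1], r[0]))
    result = []
    for _, user_rows in groupby(ordered, key=lambda r: r[1]):
        totals = defaultdict(int)  # running sum of mode_time per mode for this user
        best_mode, best_time = None, 0
        for _, day_rows in groupby(user_rows, key=lambda r: r[0]):
            day_rows = list(day_rows)
            # All rows of the same date update the totals first (they are peers).
            for _, _, mode, time in day_rows:
                totals[mode] += time
                if totals[mode] > best_time:
                    best_mode, best_time = mode, totals[mode]
            # Then every row of that date gets the same favourite.
            for row in day_rows:
                result.append(row + (best_mode, best_time))
    return result


result = add_favourite_columns(rows)
for row in result:
    print(*row)
```

The printed rows can be compared line by line with the output table above; only the order of rows within the same user_id and date may differ.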