How to get the last n days' score change for each user, along with a rank for multiple columns, using MySQL?

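I have a MySQL database that tracks the daily total score of every user (plus a few other score/count-style metrics such as "badgesEarned"; I'm including only 2 of the 5 fields I need to track here). It only contains data for the dates on which a user was active (earned points or badges), so the table does not have a row for every date.

Here's a toy example: Example Database Table: "User"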

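Now my goal is to get each user's score change over the last 7 days (I also need the last 30 and 365 days, but let's stick to 7 days for this example). Since the table stores a snapshot of the total score for every active day of every user, I wrote a SQL query that finds the two appropriate rows/snapshots and takes the difference between them. The two rows are the row for the current date (or, if it doesn't exist, the closest row before it) and the row for (current_date - 7) (or, if it doesn't exist, the closest row before it).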
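The "row for that date, or else the closest earlier row" rule above is an as-of lookup. Here is a minimal Python sketch of that rule, only to pin down the intended semantics; the snapshot data and the helper names `snapshot_on_or_before` and `score_change` are made up for illustration and are not part of the schema.

```python
from bisect import bisect_right
from datetime import date, timedelta

# Hypothetical snapshots for one user: (active date, total score), sorted by date.
snapshots = [
    (date(2020, 8, 6), 100),
    (date(2020, 8, 7), 120),
    (date(2020, 8, 12), 140),
    (date(2020, 8, 14), 150),
]

def snapshot_on_or_before(rows, day):
    """Return the latest (date, score) row with date <= day, or None."""
    dates = [d for d, _ in rows]
    idx = bisect_right(dates, day)  # number of rows dated on or before `day`
    return rows[idx - 1] if idx else None

def score_change(rows, today, days=7):
    """Difference between the as-of-today and as-of-(today - days) snapshots."""
    newest = snapshot_on_or_before(rows, today)
    oldest = snapshot_on_or_before(rows, today - timedelta(days=days))
    if newest is None or oldest is None:
        return None  # no snapshot old enough to compare against
    return newest[1] - oldest[1]

print(score_change(snapshots, date(2020, 8, 15)))  # 150 - 120 = 30
```

Returning `None` here corresponds to the case where a user has no snapshot on or before the older date, which any SQL version also has to decide how to handle (a NULL difference, or dropping the user).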

To make matters worse, I also have to track each player's "rank" using the dense_rank() SQL function and add it as a column to the final result table.
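For reference, dense ranking gives tied values the same rank and leaves no gaps after a tie. A small Python sketch of that rule, using hypothetical score changes (the `dense_rank` helper and the numbers are illustrative only):

```python
def dense_rank(values_by_user):
    """Map each user to a dense rank, highest value first; ties share a rank."""
    distinct_desc = sorted(set(values_by_user.values()), reverse=True)
    rank_of = {value: rank for rank, value in enumerate(distinct_desc, start=1)}
    return {user: rank_of[value] for user, value in values_by_user.items()}

# Hypothetical 7-day score changes keyed by userId.
deltas = {1234: 30, 100: 100, 1: 30, 7: 5}
print(dense_rank(deltas))  # {1234: 2, 100: 1, 1: 2, 7: 3}
```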

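So far I have been able to achieve this with 2 different SQL queries.

My main question is: is one of them "better" than the other in terms of performance/good practice/efficiency? Or are they both terrible, and have I been on the wrong track from the start and completely missed a more efficient approach? I'm not very good with SQL, so apologies in advance if the question and code samples are horrifying:

First approach: using only multiple nested subqueries (no joins).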

SELECT *, DENSE_RANK() OVER (ORDER BY t3.score DESC) AS ranking
FROM (
    SELECT t1.userId,
           -- score: latest snapshot up to today minus latest snapshot up to 7 days ago
           (SELECT t2.score
            FROM User t2
            WHERE t2.date <= CURDATE() AND t2.userId = t1.userId
            ORDER BY t2.date DESC LIMIT 1)
           -
           (SELECT t2.score
            FROM User t2
            WHERE t2.date <= DATE_ADD(CURDATE(), INTERVAL -7 DAY) AND t2.userId = t1.userId
            ORDER BY t2.date DESC LIMIT 1) AS score,
           -- badgesEarned: same pair of lookups for the badge count
           (SELECT t2.badgesEarned
            FROM User t2
            WHERE t2.date <= CURDATE() AND t2.userId = t1.userId
            ORDER BY t2.date DESC LIMIT 1)
           -
           (SELECT t2.badgesEarned
            FROM User t2
            WHERE t2.date <= DATE_ADD(CURDATE(), INTERVAL -7 DAY) AND t2.userId = t1.userId
            ORDER BY t2.date DESC LIMIT 1) AS badgesEarned
    FROM User t1
    GROUP BY t1.userId
) t3

Second approach: get 2 separate tables, one for each date point, then do an inner join to subtract the relevant columns.

SELECT *, DENSE_RANK() OVER (ORDER BY T0.score_delta DESC) AS ranking
FROM (
    SELECT T1.userId,
           (T1.score - T2.score) AS score_delta,
           (T1.badgesEarned - T2.badgesEarned) AS badgesEarned_delta
    FROM (
        -- latest snapshot per user on or before 7 days ago
        SELECT *
        FROM (
            SELECT *, ROW_NUMBER() OVER (PARTITION BY userId ORDER BY date DESC) AS ranking
            FROM User
            WHERE date <= DATE_ADD(CURDATE(), INTERVAL -7 DAY)
        ) t
        WHERE t.ranking = 1
    ) AS T2
    INNER JOIN (
        -- latest snapshot per user on or before today
        SELECT *
        FROM (
            SELECT *, ROW_NUMBER() OVER (PARTITION BY userId ORDER BY date DESC) AS ranking
            FROM User
            WHERE date <= CURDATE()
        ) t
        WHERE t.ranking = 1
    ) AS T1
    ON T1.userId = T2.userId
) T0

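Side question: a colleague suggested that I handle the column subtraction in application code instead. That is, I would call the database twice and get two tables (one for CURDATE() and one for CURDATE() - 7), then loop over all the user objects and subtract the relevant fields to build my final result list. I'm not sure whether that is a better approach, so should I do that rather than handle it in SQL?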
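To make the colleague's suggestion concrete, here is a rough Python sketch of the application-side subtraction. The two dictionaries stand in for the results of the two database calls; their contents, the `FIELDS` tuple, and the `subtract_snapshots` helper are assumptions made up for illustration.

```python
# Hypothetical results of the two database calls, keyed by userId:
# the latest snapshot as of today and the latest snapshot as of 7 days ago.
current = {
    1234: {"score": 150, "badgesEarned": 15},
    100: {"score": 200, "badgesEarned": 10},
    1: {"score": 150, "badgesEarned": 15},
}
week_ago = {
    1234: {"score": 120, "badgesEarned": 12},
    100: {"score": 100, "badgesEarned": 10},
}

FIELDS = ("score", "badgesEarned")

def subtract_snapshots(newer, older, fields=FIELDS):
    """Per-user field deltas for users present in both snapshots (like an INNER JOIN)."""
    return {
        user: {f: newer[user][f] - older[user][f] for f in fields}
        for user in newer.keys() & older.keys()
    }

deltas = subtract_snapshots(current, week_ago)
for user in sorted(deltas):
    print(user, deltas[user])
```

This moves the join and the subtraction into the application at the cost of two round trips and one transferred row per user per call. Whether that is faster than doing it in SQL depends on data volume, which this sketch does not measure.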

If you want to play around with the dummy data, here is the SQL Fiddle for the database: http://sqlfiddle.com/#!9/86c58f0/1

Also, both of the code snippets above run fine on my MySQL 8.0 Workbench.

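I don't quite understand your expected result. But couldn't you just use window functions, combined with a RANGE clause?

I'm only creating the central backbone table here; it's then up to you to subtract whatever you need to subtract from each other, and finally to dense_rank() whatever you need to dense_rank(). Basically, I think you need to put the final select containing DENSE_RANK() in place of the select from my with_a_week_before in-line table.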

WITH                                                                                                 
-- your input
usr(userid,dt,score,badgesearned) AS (
          SELECT 1234,DATE '2020-08-06', 100, 10
UNION ALL SELECT 1234,DATE '2020-08-07', 120, 12
UNION ALL SELECT 1234,DATE '2020-08-08', 130, 13
UNION ALL SELECT 1234,DATE '2020-08-12', 140, 14
UNION ALL SELECT 1234,DATE '2020-08-14', 150, 15
UNION ALL SELECT  100,DATE '2020-08-05', 100, 10
UNION ALL SELECT  100,DATE '2020-08-10', 100, 10
UNION ALL SELECT  100,DATE '2020-08-14', 200, 10
UNION ALL SELECT    1,DATE '2020-08-05', 140, 14
UNION ALL SELECT    1,DATE '2020-08-08', 145, 14
UNION ALL SELECT    1,DATE '2020-08-12', 150, 15
)
,
with_a_week_before AS (
  SELECT 
    *
  , FIRST_VALUE(score) OVER(
      PARTITION BY userid ORDER BY dt
      RANGE BETWEEN INTERVAL 7 DAY PRECEDING AND CURRENT ROW
    ) AS score_a_week
  , FIRST_VALUE(badgesearned) OVER(
      PARTITION BY userid ORDER BY dt
      RANGE BETWEEN INTERVAL 7 DAY PRECEDING AND CURRENT ROW
    ) AS badgesearned_a_week
  , FIRST_VALUE(dt) OVER( -- check the date of the previous row
      PARTITION BY userid ORDER BY dt
      RANGE BETWEEN INTERVAL 7 DAY PRECEDING AND CURRENT ROW
    ) AS dt_a_week
  FROM usr
)
SELECT * FROM with_a_week_before ORDER BY userid
-- out  userid |     dt     | score | badgesearned | score_a_week | badgesearned_a_week | dt_a_week  
-- out --------+------------+-------+--------------+--------------+---------------------+------------
-- out       1 | 2020-08-05 |   140 |           14 |          140 |                  14 | 2020-08-05
-- out       1 | 2020-08-08 |   145 |           14 |          140 |                  14 | 2020-08-05
-- out       1 | 2020-08-12 |   150 |           15 |          140 |                  14 | 2020-08-05
-- out     100 | 2020-08-05 |   100 |           10 |          100 |                  10 | 2020-08-05
-- out     100 | 2020-08-10 |   100 |           10 |          100 |                  10 | 2020-08-05
-- out     100 | 2020-08-14 |   200 |           10 |          100 |                  10 | 2020-08-10
-- out    1234 | 2020-08-06 |   100 |           10 |          100 |                  10 | 2020-08-06
-- out    1234 | 2020-08-07 |   120 |           12 |          100 |                  10 | 2020-08-06
-- out    1234 | 2020-08-08 |   130 |           13 |          100 |                  10 | 2020-08-06
-- out    1234 | 2020-08-12 |   140 |           14 |          100 |                  10 | 2020-08-06
-- out    1234 | 2020-08-14 |   150 |           15 |          120 |                  12 | 2020-08-07
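
To make the frame semantics easier to follow, here is a small Python sketch of what FIRST_VALUE over a 7-day RANGE frame computes, applied to the userid 1234 rows from the sample data. The `first_value_in_range` helper is a hypothetical name for illustration; it only mimics the window logic in plain Python.

```python
from datetime import date, timedelta

# Rows for userid 1234 from the sample data above: (dt, score), sorted by dt.
rows = [
    (date(2020, 8, 6), 100),
    (date(2020, 8, 7), 120),
    (date(2020, 8, 8), 130),
    (date(2020, 8, 12), 140),
    (date(2020, 8, 14), 150),
]

def first_value_in_range(rows, days=7):
    """For each row, find the earliest row whose dt lies in [dt - days, dt]."""
    out = []
    for dt, score in rows:
        start = dt - timedelta(days=days)
        # rows is sorted by dt, so the first match is the earliest row in the frame;
        # the current row itself always qualifies, so next() always finds a match
        first = next(r for r in rows if start <= r[0] <= dt)
        out.append((dt, score, first[1], first[0]))
    return out

for dt, score, score_a_week, dt_a_week in first_value_in_range(rows):
    print(dt, score, score_a_week, dt_a_week)
```

As far as I can tell, the frame never reaches back past the start of the 7-day window: if a user's only row inside the window is the current one, the "a week ago" value is the current row itself. That differs from the "or the closest row before it" rule in the question, so it is worth checking against the expected result.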