SQL count users with winning streaks
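I'm new to SQL and have been working on a few small projects to learn how it can be used for large-scale statistics.
The problem I'm currently working on is simply counting the number of users who had a winning streak within a given time period.
Given a table of the format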
user_id,date_time,user_team,win_team
ab542a,2018-01-02 18:45:25,team1,team1
ef72da,2018-01-02 08:20:01,team2,team1
f5c776,2017-12-30 15:25:25,team1,team2
5a278a,2018-01-01 14:27:15,team2,team2
ae346d,2018-01-01 14:27:15,team2,team2
2b13d8,2017-12-31 12:33:34,team1,team2
ace797,2018-01-02 08:20:01,team2,team2
ace797,2018-01-03 18:18:22,team1,team2
ab542a,2018-01-03 18:45:25,team1,team1
ef72da,2018-01-03 08:20:01,team2,team1
f5c776,2017-12-31 15:25:25,team1,team2
5a278a,2018-01-02 14:27:15,team2,team2
ae346d,2018-01-02 14:27:15,team2,team2
2b13d8,2018-01-01 12:33:34,team1,team2
ace797,2018-01-03 08:20:01,team1,team1
ace797,2018-01-04 18:18:22,team1,team1
ab542a,2018-01-04 18:45:25,team1,team1
ef72da,2018-01-04 08:20:01,team2,team1
f5c776,2018-01-01 15:25:25,team1,team2
5a278a,2018-01-03 14:27:15,team2,team2
ae346d,2018-01-03 14:27:15,team2,team2
2b13d8,2018-01-02 12:33:34,team1,team2
ace797,2018-01-04 08:20:01,team2,team2
ace797,2018-01-05 18:18:22,team1,team1
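where user (user_id) is the user ID, date (date_time) is the date of the match, team (user_team) is the team the user played for, and winner (win_team) is the winning team of the match. How can I count all users with a winning streak (at least 3 consecutive wins)?
Additionally, suppose I also want to track the game being played (chess, backgammon, etc.) in the same table. Is it possible to track winning streaks across multiple games in the same query?
In Python this could be done with a relatively simple loop over the user IDs, but that is computationally expensive and probably would not scale well.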
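For comparison, the loop-based approach mentioned above might look roughly like this. It is only a minimal sketch, assuming the rows are already loaded in memory as (user_id, date_time, user_team, win_team) tuples; the function name `users_with_streak` and the `min_streak` parameter are made up for illustration.

```python
from collections import defaultdict


def users_with_streak(rows, min_streak=3):
    """Return the set of user IDs with at least `min_streak` consecutive wins.

    rows: iterable of (user_id, date_time, user_team, win_team) tuples,
    where date_time is an ISO-style string such as '2018-01-02 18:45:25'.
    """
    # Group each user's games as (date_time, won) pairs.
    by_user = defaultdict(list)
    for user_id, date_time, user_team, win_team in rows:
        by_user[user_id].append((date_time, user_team == win_team))

    result = set()
    for user_id, games in by_user.items():
        games.sort()  # ISO-formatted timestamps sort chronologically as strings
        run = 0
        for _, won in games:
            run = run + 1 if won else 0  # a loss resets the current run
            if run >= min_streak:
                result.add(user_id)
                break
    return result
```

This needs every row in memory and one sort per user, which is the scaling concern raised above.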
If you want users with a winning streak of at least 3 games, you can use window functions, like this:
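select count(distinct t.user)
from (select t.*,
             -- date of the row two games ahead for this user
             lead(date, 2) over (partition by user order by date) as date_2,
             -- date of the row two games ahead among this user's games with the same outcome
             lead(date, 2) over (partition by user, (case when team = winner then 'win' else 'lose' end)
                                 order by date
                                ) as date_same_2
      from t
      where date >= ? and date < ?
     ) t
where team = winner and date_2 = date_same_2;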
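This works by looking at the row two rows ahead of the current one for the same user, using two criteria. The first is simply by date. The second is by date among only the rows with the same outcome, i.e. where the user's team won. If the two are the same, and the current row is a win, then you have three wins in a row. This relies on each user's game timestamps being distinct.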
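To check the idea against the sample data from the question, here is a small sketch that runs the same two-window comparison through Python's built-in sqlite3 module. The table name `matches` and the helper `streak_users` are assumptions for illustration, the column names are taken from the sample header (user_id, date_time, user_team, win_team), and it assumes the bundled SQLite is version 3.25 or newer, since that is when window functions were added.

```python
import sqlite3

# Sample rows from the question: user_id,date_time,user_team,win_team
SAMPLE = """\
ab542a,2018-01-02 18:45:25,team1,team1
ef72da,2018-01-02 08:20:01,team2,team1
f5c776,2017-12-30 15:25:25,team1,team2
5a278a,2018-01-01 14:27:15,team2,team2
ae346d,2018-01-01 14:27:15,team2,team2
2b13d8,2017-12-31 12:33:34,team1,team2
ace797,2018-01-02 08:20:01,team2,team2
ace797,2018-01-03 18:18:22,team1,team2
ab542a,2018-01-03 18:45:25,team1,team1
ef72da,2018-01-03 08:20:01,team2,team1
f5c776,2017-12-31 15:25:25,team1,team2
5a278a,2018-01-02 14:27:15,team2,team2
ae346d,2018-01-02 14:27:15,team2,team2
2b13d8,2018-01-01 12:33:34,team1,team2
ace797,2018-01-03 08:20:01,team1,team1
ace797,2018-01-04 18:18:22,team1,team1
ab542a,2018-01-04 18:45:25,team1,team1
ef72da,2018-01-04 08:20:01,team2,team1
f5c776,2018-01-01 15:25:25,team1,team2
5a278a,2018-01-03 14:27:15,team2,team2
ae346d,2018-01-03 14:27:15,team2,team2
2b13d8,2018-01-02 12:33:34,team1,team2
ace797,2018-01-04 08:20:01,team2,team2
ace797,2018-01-05 18:18:22,team1,team1
"""

QUERY = """
SELECT DISTINCT user_id
FROM (
    SELECT user_id, user_team, win_team,
           -- timestamp two games ahead for this user
           LEAD(date_time, 2) OVER (
               PARTITION BY user_id ORDER BY date_time
           ) AS date_2,
           -- timestamp two games ahead among games with the same outcome
           LEAD(date_time, 2) OVER (
               PARTITION BY user_id,
                            CASE WHEN user_team = win_team THEN 'win' ELSE 'lose' END
               ORDER BY date_time
           ) AS date_same_2
    FROM matches
    WHERE date_time >= ? AND date_time < ?
) AS m
WHERE user_team = win_team
  AND date_2 = date_same_2
ORDER BY user_id
"""


def streak_users(start, end):
    """Return sorted user IDs with 3+ consecutive wins in [start, end)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE matches "
        "(user_id TEXT, date_time TEXT, user_team TEXT, win_team TEXT)"
    )
    conn.executemany(
        "INSERT INTO matches VALUES (?, ?, ?, ?)",
        [line.split(",") for line in SAMPLE.splitlines()],
    )
    users = [row[0] for row in conn.execute(QUERY, (start, end))]
    conn.close()
    return users


if __name__ == "__main__":
    print(streak_users("2017-01-01", "2019-01-01"))
```

Over the whole sample this should list ab542a, 5a278a, ae346d and ace797; wrapping the query in a count gives the number of users. For the multi-game question, assuming a hypothetical `game` column, adding it to both PARTITION BY clauses should make the streaks per user and per game.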