Calculating average time interval length
I have prepared a simple SQL Fiddle to demonstrate my question.
In PostgreSQL 10.3 I store user information, two-player games and moves in the following 3 tables:
CREATE TABLE players (
uid SERIAL PRIMARY KEY,
name text NOT NULL
);
CREATE TABLE games (
gid SERIAL PRIMARY KEY,
player1 integer NOT NULL REFERENCES players ON DELETE CASCADE,
player2 integer NOT NULL REFERENCES players ON DELETE CASCADE
);
CREATE TABLE moves (
mid BIGSERIAL PRIMARY KEY,
uid integer NOT NULL REFERENCES players ON DELETE CASCADE,
gid integer NOT NULL REFERENCES games ON DELETE CASCADE,
played timestamptz NOT NULL
);
Let's assume that 2 players, Alice and Bob, have played 3 games against each other:
INSERT INTO players (name) VALUES ('Alice'), ('Bob');
INSERT INTO games (player1, player2) VALUES (1, 2);
INSERT INTO games (player1, player2) VALUES (1, 2);
INSERT INTO games (player1, player2) VALUES (1, 2);
Let's assume that the 1st game was played quickly, with a move every minute.
But then they chilled :-) and played 2 slow games, with a move every 10 minutes:
INSERT INTO moves (uid, gid, played) VALUES
(1, 1, now() + interval '1 min'),
(2, 1, now() + interval '2 min'),
(1, 1, now() + interval '3 min'),
(2, 1, now() + interval '4 min'),
(1, 1, now() + interval '5 min'),
(2, 1, now() + interval '6 min'),
(1, 2, now() + interval '10 min'),
(2, 2, now() + interval '20 min'),
(1, 2, now() + interval '30 min'),
(2, 2, now() + interval '40 min'),
(1, 2, now() + interval '50 min'),
(2, 2, now() + interval '60 min'),
(1, 3, now() + interval '110 min'),
(2, 3, now() + interval '120 min'),
(1, 3, now() + interval '130 min'),
(2, 3, now() + interval '140 min'),
(1, 3, now() + interval '150 min'),
(2, 3, now() + interval '160 min');
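On a web page with game statistics I would like to display the average time between moves for each player.
So I suppose I have to use the LAG window function of PostgreSQL.
Since several games can be played simultaneously, I am trying to PARTITION BY gid (i.e. by the "game id").
Unfortunately, I get the syntax error "window function calls cannot be nested" with my SQL query: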
SELECT AVG(played - LAG(played) OVER (PARTITION BY gid order by played))
OVER (PARTITION BY gid order by played)
FROM moves
-- trying to calculate average thinking time for player Alice
WHERE uid = 1;
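UPDATE:
Since the number of games in my database is large and grows day by day, I have tried (here is the new SQL Fiddle) adding a condition to the inner SELECT query: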
SELECT AVG(played - prev_played)
FROM (SELECT m.*,
LAG(m.played) OVER (PARTITION BY m.gid ORDER BY played) AS prev_played
FROM moves m
JOIN games g ON (m.uid in (g.player1, g.player2))
WHERE m.played > now() - interval '1 month'
) m
WHERE uid = 1;
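However, for some reason this changes the returned value quite radically, to 1 min 45 s.
I wonder why the inner SELECT query suddenly returns many more rows. Is some condition missing in my JOIN?
UPDATE 2:
Oh OK, I see why the average value decreases: because of multiple rows with the same timestamp (i.e. played - prev_played = 0), but how do I fix the JOIN?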
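The row multiplication can be checked directly. The queries below are a minimal sketch that assumes only the three tables and the sample rows from the question; the column aliases (plain_rows, joined_rows, copies) are made up for illustration.

```sql
-- Rows in moves on their own: 18 with the sample data.
SELECT count(*) AS plain_rows
FROM moves;

-- Rows after the JOIN that only checks player membership.
-- Alice and Bob take part in all 3 games, so every move matches
-- every game and the count becomes 18 * 3 = 54.
SELECT count(*) AS joined_rows
FROM moves m
JOIN games g ON (m.uid IN (g.player1, g.player2));

-- Each (gid, played) pair now occurs 3 times inside its partition,
-- which is where the zero-length intervals seen by LAG() come from.
SELECT m.gid, m.played, count(*) AS copies
FROM moves m
JOIN games g ON (m.uid IN (g.player1, g.player2))
GROUP BY m.gid, m.played
HAVING count(*) > 1
ORDER BY m.gid, m.played;
```

With the sample data this also accounts for the 1 min 45 s: Alice's tripled rows yield 24 non-NULL differences (8 per game) that still sum to 42 minutes, and 42 / 24 = 1.75 minutes.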
UPDATE 3:
Never mind, I was missing the m.gid = g.gid AND condition in my SQL JOIN, now it works:
SELECT AVG(played - prev_played)
FROM (SELECT m.*,
LAG(m.played) OVER (PARTITION BY m.gid ORDER BY played) AS prev_played
FROM moves m
JOIN games g ON (m.gid = g.gid AND m.uid in (g.player1, g.player2))
WHERE m.played > now() - interval '1 month'
) m
WHERE uid = 1;
You need a subquery to nest the window functions. I think this does what you want:
select avg(played - prev_played)
from (select m.*,
lag(m.played) over (partition by gid order by played) as prev_played
from moves m
) m
where uid = 1;
Note: the where needs to go in the outer query, so that it doesn't affect the lag().
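To illustrate why the placement of the filter matters, here is a sketch of the variant with the filter moved inside the subquery, assuming the sample data from the question (the alias avg_gap_between_own_moves is made up for illustration). Bob's rows are removed before lag() runs, so prev_played becomes Alice's own previous move rather than her opponent's.

```sql
-- Filter applied BEFORE lag(): Bob's rows are gone, so prev_played
-- is Alice's own previous move in the same game, not Bob's move.
SELECT avg(played - prev_played) AS avg_gap_between_own_moves
FROM (SELECT m.*,
             lag(m.played) OVER (PARTITION BY gid ORDER BY played) AS prev_played
      FROM moves m
      WHERE uid = 1
     ) m;
```

On the sample data this variant should average the gaps 2, 2, 20, 20, 20 and 20 minutes, giving 14 minutes, whereas the query with the filter in the outer query averages 1, 1, 10, 10, 10 and 10 minutes, giving 7 minutes.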
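@gordon's answer is probably good enough. But it isn't the result you asked for in your comment. It only works because the data has the same number of rows for each game, so the average of the per-game averages is the same as the overall average. If you want the average over the games, you need one more level.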
With cte as (
SELECT gid, AVG(played - prev_played) as play_avg
FROM (select m.*,
lag(m.played) over (partition by gid order by played) as prev_played
from moves m
) m
WHERE uid = 1
GROUP BY gid
)
SELECT AVG(play_avg)
FROM cte
;
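The difference between the two averages only shows up once games have different numbers of moves. The following self-contained sketch uses made-up data (the names sample_moves, diffs, think_time, per_game, and the gid values 10 and 20 are all hypothetical) with one game of three quick replies by Alice and one game with a single slow reply.

```sql
WITH sample_moves (gid, uid, played) AS (
    VALUES
        -- game 10: Alice (uid 1) answers each of Bob's moves after 1 minute
        (10, 2, timestamptz '2018-04-01 12:00:00+00'),
        (10, 1, timestamptz '2018-04-01 12:01:00+00'),
        (10, 2, timestamptz '2018-04-01 12:02:00+00'),
        (10, 1, timestamptz '2018-04-01 12:03:00+00'),
        (10, 2, timestamptz '2018-04-01 12:04:00+00'),
        (10, 1, timestamptz '2018-04-01 12:05:00+00'),
        -- game 20: a single reply by Alice, after 10 minutes
        (20, 2, timestamptz '2018-04-01 13:00:00+00'),
        (20, 1, timestamptz '2018-04-01 13:10:00+00')
),
diffs AS (
    -- time since the previous move in the same game, for every move
    SELECT gid, uid,
           played - lag(played) OVER (PARTITION BY gid ORDER BY played) AS think_time
    FROM sample_moves
)
SELECT
    -- one average over all of Alice's moves
    (SELECT avg(think_time) FROM diffs WHERE uid = 1) AS avg_over_all_moves,
    -- average per game first, then the average of those values
    (SELECT avg(game_avg)
     FROM (SELECT avg(think_time) AS game_avg
           FROM diffs
           WHERE uid = 1
           GROUP BY gid) per_game) AS avg_of_game_averages;
```

Here the first column weights every move equally, (1 + 1 + 1 + 10) / 4 = 3 min 15 s, while the second weights every game equally, (1 + 10) / 2 = 5 min 30 s. Which of the two is the right statistic depends on what the web page is meant to show.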