SQL get range of dates in between given dates grouped by another column in a table
Given this table:
----------------------------------------------
ID | user | type | timestamp
----------------------------------------------
1 | 1 | 1 | 2019-02-08 15:00:00
2 | 1 | 3 | 2019-02-15 15:00:00
3 | 1 | 2 | 2019-03-06 15:00:00
4 | 2 | 3 | 2019-02-01 15:00:00
5 | 2 | 1 | 2019-02-06 15:00:00
6 | 3 | 1 | 2019-01-10 15:00:00
7 | 3 | 4 | 2019-02-08 15:00:00
8 | 3 | 3 | 2019-02-24 15:00:00
9 | 3 | 2 | 2019-03-04 15:00:00
10 | 3 | 3 | 2019-03-05 15:00:00
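I need to find the number of days each user spent on a particular type within a given date range.
For example, for the given range 2019-02-01 to 2019-03-04, the output should be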
--------------------------------
user | type | No. of days
--------------------------------
1 | 1 | 7
1 | 3 | 17
2 | 3 | 6
3 | 1 | 29
2 | 4 | 16
2 | 3 | 8
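Users can switch between types at any time, but I need to capture all of these switches and the number of days the user spent on each type. I currently solve this by fetching all the values and filtering them manually in JS. Is there a way to do this with a SQL query? I am using MySQL 5.7.23.
Edit:
The output above is incorrect, but thank you all for overlooking that and helping me get to the right query. This is the correct output for this question: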
--------------------------------
user | type | No. of days
--------------------------------
1 | 1 | 7
1 | 3 | 19
2 | 3 | 5
3 | 1 | 29
3 | 2 | 1
3 | 3 | 8
3 | 4 | 16
Use lead(), then datediff() and sum(), along with a lot of date comparisons:
select user, type,
sum(datediff(least(next_ts, '2019-03-04'), greatest(timestamp, '2019-02-01'))) as no_of_days
from (select t.*,
lead(timestamp, 1, '2019-03-04') over (partition by user order by timestamp) as next_ts
from t
) t
where next_ts >= '2019-02-01' and
timestamp <= '2019-03-04'
group by user, type;
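To see what the least()/greatest() clipping does, here is a minimal Python sketch of the same arithmetic. It is not the query itself: it assumes the rows are already loaded as (user, type, date) tuples, works on calendar dates only (as datediff() does), and skips stays that do not overlap the range. The name `days_per_type` is made up for this illustration.

```python
from collections import defaultdict
from datetime import date


def days_per_type(events, range_start, range_end):
    """Sum, per (user, type), the days of each stay that fall inside the range.

    events: iterable of (user, type, day) tuples, where day is a datetime.date.
    """
    by_user = defaultdict(list)
    for user, typ, day in events:
        by_user[user].append((day, typ))

    totals = defaultdict(int)
    for user, rows in by_user.items():
        rows.sort()  # order by date within each user, like the window's ORDER BY
        for i, (start, typ) in enumerate(rows):
            # Next switch of the same user; fall back to range_end for the
            # last row, mirroring lead(timestamp, 1, '<range end>').
            nxt = rows[i + 1][0] if i + 1 < len(rows) else range_end
            lo = max(start, range_start)  # greatest(timestamp, range start)
            hi = min(nxt, range_end)      # least(next_ts, range end)
            if hi > lo:                   # keep only stays that overlap the range
                totals[(user, typ)] += (hi - lo).days
    return dict(totals)
```

Because every stay is clipped to the range, a stay that starts before 2019-02-01 or ends after 2019-03-04 contributes fewer days than the plain difference between two switches. With the sample rows, user 1 gets 7 days on type 1 and 17 days on type 3.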
Edit:
In older versions (before MySQL 8.0, which have no window functions), you can use:
select user, type,
sum(datediff(least(next_ts, '2019-03-04'), greatest(timestamp, '2019-02-01'))) as no_of_days
from (select t.*,
(select coalesce(min(timestamp), '2019-03-04')
from t t2
where t2.user = t.user and t2.timestamp > t.timestamp
) as next_ts
from t
) t
where next_ts >= '2019-02-01' and
timestamp <= '2019-03-04'
group by user, type;
This should give you what you need:
select id, user, type, time_stamp, (
select datediff(min(time_stamp), t1.time_stamp)
from table1 as t2
where t2.user = t1.user
and t2.time_stamp > t1.time_stamp
) as days
from table1 as t1
where 0 < (select count(*) from table1 as t3 where t3.user = t1.user
and t3.time_stamp > t1.time_stamp )
order by id;
Working fiddle here: http://sqlfiddle.com/#!9/347ab5/26
If you also want the "final" row for each user, use this variant:
select id, user, type, time_stamp, (
select datediff(coalesce(min(time_stamp),current_timestamp()) , t1.time_stamp)
from table1 as t2
where t2.user = t1.user
and t2.time_stamp > t1.time_stamp
) as days
from table1 as t1
order by id;
This does not give you exactly the output you asked for, but it is accurate:
SELECT
`user`
,`type`
,dategone `No. of days`
FROM
(SELECT
`type`,
IF(@id = `user`,DATEDIFF(`timestamp` , @days), -1) dategone #
,@id := `user` `user`
,@days := `timestamp`
FROM
(SELECT
`D`, `user`, `type`, `timestamp`
From table1
ORDER BY `user` ASC, `timestamp` ASC) a
, (SELECT @days :=0) b, (SELECT @id :=0) c) d
WHERE dategone > -1;
CREATE TABLE table1 (
`D` INTEGER,
`user` INTEGER,
`type` INTEGER,
`timestamp` VARCHAR(19)
);
INSERT INTO table1
(`D`, `user`, `type`, `timestamp`)
VALUES
('1', '1', '1', '2019-02-08 15:00:00'),
('2', '1', '3', '2019-02-15 15:00:00'),
('3', '1', '2', '2019-03-06 15:00:00'),
('4', '2', '3', '2019-02-01 15:00:00'),
('5', '2', '1', '2019-02-06 15:00:00'),
('6', '3', '1', '2019-01-10 15:00:00'),
('7', '3', '4', '2019-02-08 15:00:00'),
('8', '3', '3', '2019-02-24 15:00:00'),
('9', '3', '2', '2019-03-04 15:00:00'),
('10', '3', '3', '2019-03-05 15:00:00');
SELECT
`user`
,`type`
,dategone `No. of days`
FROM
(SELECT
`type`,
IF(@id = `user`,DATEDIFF(`timestamp` , @days), -1) dategone #
,@id := `user` `user`
,@days := `timestamp`
FROM
(SELECT
`D`, `user`, `type`, `timestamp`
From table1
ORDER BY `user` ASC, `timestamp` ASC) a, (SELECT @days :=0) b, (SELECT @id :=0) c) d
WHERE dategone > -1;
user | type | No. of days
---: | ---: | ----------:
1 | 3 | 7
1 | 2 | 19
2 | 1 | 5
3 | 4 | 29
3 | 3 | 16
3 | 2 | 8
3 | 3 | 1
db<>fiddle here
Here is one way to do it in MySQL 5.7 without user variables:
select
t.user,
t.type,
sum(datediff(
greatest(tlead.timestamp, '2019-02-01'),
least(t.timestamp, '2019-03-04'))
) no_of_days
from mytable t
inner join mytable tlead
on tlead.user = t.user
and tlead.timestamp > t.timestamp
and not exists (
select 1
from mytable t1
where
t1.user = t.user
and t1.timestamp > t.timestamp
and t1.timestamp < tlead.timestamp
)
where tlead.timestamp >= '2019-02-01' and t.timestamp <= '2019-03-04'
group by t.user, t.type
order by t.user, t.type
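This basically emulates lead() with a self-join and a not exists condition: the table alias tlead is the next record for the same user. The rest is just filtering, aggregation, and computing the date differences within the target date range.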
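The "next record" pairing can be sketched outside SQL as well. The Python below is only an illustration under assumptions: rows are (user, type, date) tuples compared by calendar date, and the names `next_row_pairs` and `days_between_switches` are hypothetical. It keeps a pair (t, tlead) only when no row of the same user lies strictly between them, which is what the not exists condition expresses.

```python
from datetime import date


def next_row_pairs(rows):
    """Pair each (user, type, day) row with the next row of the same user."""
    pairs = []
    for t in rows:
        for tlead in rows:
            # Join condition: same user, strictly later date.
            if tlead[0] != t[0] or tlead[2] <= t[2]:
                continue
            # NOT EXISTS: no row of this user strictly between t and tlead.
            in_between = any(
                t1[0] == t[0] and t[2] < t1[2] < tlead[2] for t1 in rows
            )
            if not in_between:
                pairs.append((t, tlead))
    return pairs


def days_between_switches(rows, range_start, range_end):
    """Sum the days from each row to its next row, grouped by (user, type)."""
    totals = {}
    for t, tlead in next_row_pairs(rows):
        # Same filter as the WHERE clause: the pair must touch the range.
        if tlead[2] >= range_start and t[2] <= range_end:
            key = (t[0], t[1])
            totals[key] = totals.get(key, 0) + (tlead[2] - t[2]).days
    return totals
```

The last row of each user has no later row to pair with, so it drops out, just as it does with the inner join. The nested loops are quadratic and only meant to mirror the join logic on a small sample.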
Demo on DB Fiddle. The results are not exactly the same as yours, but I suspect they are actually correct:
user | type | no_of_days
---: | ---: | ---------:
1 | 1 | 7
1 | 3 | 19
2 | 3 | 5
3 | 1 | 29
3 | 2 | 1
3 | 3 | 8
3 | 4 | 16