SQL get range of dates in between given dates grouped by another column in a table

Given this table:

----------------------------------------------
ID  | user   | type   | timestamp
----------------------------------------------
1   | 1      | 1      | 2019-02-08 15:00:00
2   | 1      | 3      | 2019-02-15 15:00:00
3   | 1      | 2      | 2019-03-06 15:00:00
4   | 2      | 3      | 2019-02-01 15:00:00
5   | 2      | 1      | 2019-02-06 15:00:00
6   | 3      | 1      | 2019-01-10 15:00:00
7   | 3      | 4      | 2019-02-08 15:00:00
8   | 3      | 3      | 2019-02-24 15:00:00
9   | 3      | 2      | 2019-03-04 15:00:00
10  | 3      | 3      | 2019-03-05 15:00:00

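I need to find out, for each user, how many days they spent on a particular type within a given date range.

For example, for the given range 2019-02-01 to 2019-03-04, the output should be: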

--------------------------------
user   | type   | No. of days
--------------------------------
1      | 1      | 7
1      | 3      | 17
2      | 3      | 6
3      | 1      | 29
2      | 4      | 16
2      | 3      | 8

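Users can switch between types at any time, but I need to capture all of these switches and the number of days a user spent on each type. I currently solve this by fetching all the values and filtering them manually in JS. Is there a way to do this with a SQL query? I am using MySQL 5.7.23.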

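EDIT:

The output above is incorrect, but many thanks to everyone for ignoring that and helping me with the right query. This is the correct output for this question: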

--------------------------------
user | type | No. of days
--------------------------------
   1 |    1 |          7
   1 |    3 |         19
   2 |    3 |          5
   3 |    1 |         29
   3 |    2 |          1
   3 |    3 |          8
   3 |    4 |         16

Use lead(), then datediff() and sum(), along with a lot of date comparisons:

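select user, type,
       -- clip each interval to the requested range before measuring it
       sum(datediff(least(next_ts, '2019-03-04'), greatest(timestamp, '2019-02-01'))) as no_of_days
from (select t.*,
             -- next switch of the same user; defaults to the range end for the last row
             lead(timestamp, 1, '2019-03-04') over (partition by user order by timestamp) as next_ts
      from t
     ) t
where next_ts >= '2019-02-01' and
      timestamp <= '2019-03-04'
group by user, type;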

EDIT:

In older versions (before MySQL 8.0, which added window functions such as lead()), you can use:

select user, type,
       -- clip each interval to the requested range before measuring it
       sum(datediff(least(next_ts, '2019-03-04'), greatest(timestamp, '2019-02-01'))) as no_of_days
from (select t.*,
             -- correlated subquery standing in for lead(): the next timestamp of the same user
             (select coalesce(min(timestamp), '2019-03-04')
               from t t2
               where t2.user = t.user and t2.timestamp > t.timestamp
             ) as next_ts
      from t
     ) t
where next_ts >= '2019-02-01' and
      timestamp <= '2019-03-04'
group by user, type;
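
To make the clipping logic of the two queries above easier to follow, here is a minimal sketch in Python rather than SQL. The function name `days_per_type_in_range` and the `ROWS` layout are made up for illustration, and the sketch drops the time of day and works on calendar dates only, which is a simplification and may not match how MySQL compares a DATETIME with a date string.

```python
from datetime import date, datetime

# Sample rows from the question: (user, type, timestamp)
ROWS = [
    (1, 1, "2019-02-08 15:00:00"),
    (1, 3, "2019-02-15 15:00:00"),
    (1, 2, "2019-03-06 15:00:00"),
    (2, 3, "2019-02-01 15:00:00"),
    (2, 1, "2019-02-06 15:00:00"),
    (3, 1, "2019-01-10 15:00:00"),
    (3, 4, "2019-02-08 15:00:00"),
    (3, 3, "2019-02-24 15:00:00"),
    (3, 2, "2019-03-04 15:00:00"),
    (3, 3, "2019-03-05 15:00:00"),
]

RANGE_START = date(2019, 2, 1)
RANGE_END = date(2019, 3, 4)


def days_per_type_in_range(rows, start, end):
    """Sum, per (user, type), the days each interval overlaps [start, end].

    Assumption: only calendar dates are compared (time of day is dropped).
    """
    by_user = {}
    for user, kind, ts in rows:
        day = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").date()
        by_user.setdefault(user, []).append((day, kind))

    totals = {}
    for user, events in by_user.items():
        events.sort()  # order by date within each user
        for i, (day, kind) in enumerate(events):
            # "lead": the next switch date, or the range end for the last row
            next_day = events[i + 1][0] if i + 1 < len(events) else end
            # keep only intervals that touch the requested range
            if next_day < start or day > end:
                continue
            # least(next, end) - greatest(current, start)
            clipped = (min(next_day, end) - max(day, start)).days
            totals[(user, kind)] = totals.get((user, kind), 0) + clipped
    return totals


for (user, kind), days in sorted(days_per_type_in_range(ROWS, RANGE_START, RANGE_END).items()):
    print(user, kind, days)
```

Under these assumptions, user 1 gets 17 days on type 3, because the interval is cut off at 2019-03-04. The asker's corrected table shows 19, which is the full interval up to the next switch on 2019-03-06.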

This should do what you need:

select id, user, type, time_stamp, (
    select datediff(min(time_stamp), t1.time_stamp)
    from table1 as t2
    where t2.user = t1.user 
    and   t2.time_stamp > t1.time_stamp
    ) as days
from table1 as t1
where 0 < (select count(*) from table1 as t3 where t3.user = t1.user
           and   t3.time_stamp > t1.time_stamp )
order by id;

Working fiddle here: http://sqlfiddle.com/#!9/347ab5/26

If you also want the "final" row for each user, use this variant:

select id, user, type, time_stamp, (
    select datediff(coalesce(min(time_stamp),current_timestamp()) , t1.time_stamp)
    from table1 as t2
    where t2.user = t1.user 
    and   t2.time_stamp > t1.time_stamp
    ) as days
from table1 as t1
order by id;

What you get is not exactly what you asked for, but it is accurate:

SELECT 
  `user`
  ,`type`
  ,dategone `No. of days`
  FROM
(SELECT 
  `type`,
  -- days since the previous row of the same user, or -1 on the first row of a user
  IF(@id = `user`, DATEDIFF(`timestamp`, @days), -1) dategone
  ,@id := `user`  `user`
  ,@days := `timestamp`
 FROM
   (SELECT
      `D`, `user`, `type`, `timestamp`
    FROM table1
    ORDER BY `user` ASC, `timestamp`  ASC) a
   , (SELECT @days :=0) b, (SELECT @id :=0) c) d
WHERE dategone > -1;

Schema and sample data:

CREATE TABLE table1 (
  `D` INTEGER,
  `user` INTEGER,
  `type` INTEGER,
  `timestamp` VARCHAR(19)
);

INSERT INTO table1
  (`D`, `user`, `type`, `timestamp`)
VALUES
  ('1', '1', '1', '2019-02-08 15:00:00'),
  ('2', '1', '3', '2019-02-15 15:00:00'),
  ('3', '1', '2', '2019-03-06 15:00:00'),
  ('4', '2', '3', '2019-02-01 15:00:00'),
  ('5', '2', '1', '2019-02-06 15:00:00'),
  ('6', '3', '1', '2019-01-10 15:00:00'),
  ('7', '3', '4', '2019-02-08 15:00:00'),
  ('8', '3', '3', '2019-02-24 15:00:00'),
  ('9', '3', '2', '2019-03-04 15:00:00'),
  ('10', '3', '3', '2019-03-05 15:00:00');

Result of the query above on this data:

user | type | No. of days
---: | ---: | ----------:
   1 |    3 |           7
   1 |    2 |          19
   2 |    1 |           5
   3 |    4 |          29
   3 |    3 |          16
   3 |    2 |           8
   3 |    3 |           1

db<>fiddle here
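
As a rough illustration of what the user-variable query above does, here is a small Python sketch of the same row-by-row scan. The names `days_since_previous_row`, `prev_user` and `prev_day` are hypothetical; they stand in for the roles of `@id` and `@days`. It is only a sketch of the idea, not a statement about how MySQL evaluates user variables.

```python
from datetime import datetime

# Sample rows from the question: (user, type, timestamp)
ROWS = [
    (1, 1, "2019-02-08 15:00:00"),
    (1, 3, "2019-02-15 15:00:00"),
    (1, 2, "2019-03-06 15:00:00"),
    (2, 3, "2019-02-01 15:00:00"),
    (2, 1, "2019-02-06 15:00:00"),
    (3, 1, "2019-01-10 15:00:00"),
    (3, 4, "2019-02-08 15:00:00"),
    (3, 3, "2019-02-24 15:00:00"),
    (3, 2, "2019-03-04 15:00:00"),
    (3, 3, "2019-03-05 15:00:00"),
]


def days_since_previous_row(rows):
    """Scan rows ordered by (user, timestamp) and report the gap to the previous row."""
    # ISO-style timestamps sort correctly as plain strings
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))
    result = []
    prev_user = None  # plays the role of @id
    prev_day = None   # plays the role of @days
    for user, kind, ts in ordered:
        day = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").date()
        if user == prev_user:
            # the gap is reported on the *current* row, so it carries that row's type
            result.append((user, kind, (day - prev_day).days))
        prev_user, prev_day = user, day
    return result


for row in days_since_previous_row(ROWS):
    print(*row)
```

On the sample data this prints the same seven rows as the result table above. Note that each gap is attached to the type of the later row, so the type column is shifted by one switch compared with the asker's corrected table (for example `1 3 7` here versus `1 | 1 | 7` there).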

Here is one way to do it in MySQL 5.7 without user variables:

select 
    t.user,
    t.type,
    sum(datediff(
        greatest(tlead.timestamp, '2019-02-01'), 
        least(t.timestamp, '2019-03-04'))
    ) no_of_days
from mytable t
inner join mytable tlead 
    on  tlead.user = t.user
    and tlead.timestamp > t.timestamp
    and not exists (
        select 1
        from mytable t1
        where 
            t1.user = t.user 
            and t1.timestamp > t.timestamp
            and t1.timestamp < tlead.timestamp
    )
where tlead.timestamp >= '2019-02-01' and t.timestamp <= '2019-03-04'
group by t.user, t.type
order by t.user, t.type

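This basically emulates lead() with a self-join and a not exists condition: table alias tlead is the next record for the same user. The rest is just filtering, aggregation, and computing the date difference within the target date range.

Demo on DB Fiddle. The results are not exactly the same as yours, but I suspect they are actually correct: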

user | type | no_of_days
---: | ---: | ---------:
   1 |    1 |          7
   1 |    3 |         19
   2 |    3 |          5
   3 |    1 |         29
   3 |    2 |          1
   3 |    3 |          8
   3 |    4 |         16
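
The self-join plus not exists pairing described above can be sketched outside SQL as well. The following Python is only an illustration: `next_row_pairs` and `days_per_type` are made-up names, and the sketch covers just the "next record for the same user" pairing and the day count, not the date-range filter of the query.

```python
from datetime import datetime

# Sample rows from the question: (user, type, timestamp)
ROWS = [
    (1, 1, "2019-02-08 15:00:00"),
    (1, 3, "2019-02-15 15:00:00"),
    (1, 2, "2019-03-06 15:00:00"),
    (2, 3, "2019-02-01 15:00:00"),
    (2, 1, "2019-02-06 15:00:00"),
    (3, 1, "2019-01-10 15:00:00"),
    (3, 4, "2019-02-08 15:00:00"),
    (3, 3, "2019-02-24 15:00:00"),
    (3, 2, "2019-03-04 15:00:00"),
    (3, 3, "2019-03-05 15:00:00"),
]


def parse(ts):
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")


def next_row_pairs(rows):
    """Pair each row with the next row of the same user (self-join + NOT EXISTS)."""
    pairs = []
    for user, kind, ts in rows:
        for lead_user, _lead_kind, lead_ts in rows:
            # join condition: same user, strictly later timestamp
            if lead_user != user or parse(lead_ts) <= parse(ts):
                continue
            # NOT EXISTS: no row of this user strictly between the two
            has_between = any(
                u == user and parse(ts) < parse(t) < parse(lead_ts)
                for u, _k, t in rows
            )
            if not has_between:
                pairs.append((user, kind, parse(ts), parse(lead_ts)))
    return pairs


def days_per_type(rows):
    """Sum the calendar-day gaps per (user, type), like SUM(DATEDIFF(...))."""
    totals = {}
    for user, kind, ts, lead_ts in next_row_pairs(rows):
        days = (lead_ts.date() - ts.date()).days  # date parts only
        totals[(user, kind)] = totals.get((user, kind), 0) + days
    return totals


for (user, kind), days in sorted(days_per_type(ROWS).items()):
    print(user, kind, days)
```

For the sample data this yields the same counts as the table above. The last row of each user has no later row to pair with, so it drops out, just as it does with the inner join.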