See the distribution of secondary requests grouped by time interval in SQL

I have the following table:

RequestId  Type  Date        ParentRequestId
1          1     2020-10-15  null
2          2     2020-10-19  1
3          1     2020-10-20  null
4          2     2020-11-15  3

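To keep the example simple, I am only interested in request types 1 and 2. My task is to query a large database and see the distribution of secondary transactions based on the date difference from the parent transaction. So the result would look like: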

Interval    Percentage
0-7 days    50 %
8-15 days   0 %
16-50 days  50 %

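So the first row of the expected result covers the request with ID 2 (4 days after its parent), and the third row covers the request with ID 4 (26 days after its parent), because those date differences fall into the respective intervals.

How can I achieve this?

I am using SQL Server 2014.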

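We'd like to see your attempt, but on the face of it, it looks like you need to treat this table as two tables (parents and children) and do a basic GROUP BY, made fancy by grouping on a CASE expression: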

WITH dateDiffs as (
    /* perform our date calculations first, to get that out of the way */
    SELECT 
      DATEDIFF(Day, parent.[Date], child.[Date]) as daysDiff,
      1 as rowsFound
    FROM    (SELECT RequestID, [Date] FROM myTable WHERE Type = 1) parent
    INNER JOIN  (SELECT ParentRequestID, [Date] FROM myTable WHERE Type = 2) child
    ON parent.requestID = child.parentRequestID
)

/* Now group and aggregate and enjoy your maths! */
SELECT 
  case when daysDiff between 0 and 7 then '0-7'
       when daysDiff between 8 and 15 then '8-15'
       when daysDiff between 16 and 50 THEN '16-50'
       else '50+' 
   end as myInterval,
   sum(rowsFound) as totalFound,
   (select sum(rowsFound) from dateDiffs) as totalRows,
   1.0 * sum(rowsFound) / (select sum(rowsFound) from dateDiffs) * 100.00 as percentFound
FROM dateDiffs
GROUP BY 
   case when daysDiff between 0 and 7 then '0-7'
       when daysDiff between 8 and 15 then '8-15'
       when daysDiff between 16 and 50 THEN '16-50'
       else '50+' 
   end;

This looks like basically a join and group by query:

with dates as (
      select 0 as lo, 7 as hi, '0-7 days' as grp union all
      select 8 as lo, 15 as hi, '8-15 days' union all
      select 16 as lo, 50 as hi, '16-50 days'
     )      
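select d.grp,
       -- count a column from the joined side, so empty intervals give 0 rather than 1
       count(t.RequestId) as cnt,
       count(t.RequestId) * 1.0 / sum(count(t.RequestId)) over () as ratio
from dates d left join
     (t join
      t tp
      on tp.RequestId = t.ParentRequestId
     )
     on datediff(day, tp.date, t.date) between d.lo and d.hi
group by d.grp, d.lo   -- d.lo must be grouped to be usable in order by
order by d.lo;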

The only trick is generating all the date groups, so that you also get rows for intervals with zero values.
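
To try the idea of generating all the date groups against the sample data from the question, something like the following should work on SQL Server 2014. It is only a sketch: the table variable `@Requests` and the CTE names `diffs` and `buckets` are made up for illustration, and percentages are computed only over requests that fall into one of the three listed intervals.

```sql
-- Hypothetical stand-in for the real table, filled with the question's sample rows
DECLARE @Requests TABLE (
    RequestId       int,
    [Type]          int,
    [Date]          date,
    ParentRequestId int
);

INSERT INTO @Requests (RequestId, [Type], [Date], ParentRequestId)
VALUES (1, 1, '2020-10-15', NULL),
       (2, 2, '2020-10-19', 1),
       (3, 1, '2020-10-20', NULL),
       (4, 2, '2020-11-15', 3);

WITH diffs AS (
    -- one row per secondary request: days elapsed since its parent
    SELECT DATEDIFF(day, p.[Date], c.[Date]) AS daysDiff
    FROM @Requests AS c
    INNER JOIN @Requests AS p
        ON p.RequestId = c.ParentRequestId
    WHERE p.[Type] = 1 AND c.[Type] = 2
),
buckets AS (
    -- every interval is listed here, so it appears even with no matches
    SELECT lo, hi, grp
    FROM (VALUES (0, 7, '0-7 days'),
                 (8, 15, '8-15 days'),
                 (16, 50, '16-50 days')) AS v (lo, hi, grp)
)
SELECT b.grp AS [Interval],
       COUNT(d.daysDiff) AS cnt,   -- NULLs from the left join are not counted
       100.0 * COUNT(d.daysDiff)
             / NULLIF(SUM(COUNT(d.daysDiff)) OVER (), 0) AS percentage
FROM buckets AS b
LEFT JOIN diffs AS d
    ON d.daysDiff BETWEEN b.lo AND b.hi
GROUP BY b.grp, b.lo
ORDER BY b.lo;
```

With the four sample rows the differences are 4 and 26 days, so the query should report 50 percent for 0-7 days, 0 percent for 8-15 days and 50 percent for 16-50 days. The `NULLIF` guards against a division by zero when no secondary request matches any interval; in that case the percentage comes back as NULL.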