See the distribution of secondary requests grouped by time interval in SQL
I have the following table:
RequestId  Type  Date        ParentRequestId
1          1     2020-10-15  null
2          2     2020-10-19  1
3          1     2020-10-20  null
4          2     2020-11-15  3
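To keep the example simple, I am only interested in request types 1 and 2. My task is to query a large database and see the distribution of secondary transactions based on the difference in days from their parent transaction. So the result should look like: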
Interval,Percentage
0-7 days,50 %
8-15 days,0 %
16-50 days,50 %
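So the first row of the expected result covers the request with ID 2 (4 days after its parent), and the third row covers the request with ID 4 (26 days after its parent), because those date differences fall within the respective intervals.
How can I achieve this?
I am using SQL Server 2014.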
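To experiment with the sample data, a minimal setup script might look like the following. This is only a sketch: the table name myTable is borrowed from the first answer below, and the column types are assumptions, since the question does not state them.

```sql
-- Hypothetical setup for the sample data in the question.
-- Table name and column types are assumptions, not taken from the asker's schema.
CREATE TABLE myTable (
    RequestId       INT  NOT NULL PRIMARY KEY,
    [Type]          INT  NOT NULL,
    [Date]          DATE NOT NULL,
    ParentRequestId INT  NULL   -- NULL for top-level (parent) requests
);

INSERT INTO myTable (RequestId, [Type], [Date], ParentRequestId)
VALUES (1, 1, '2020-10-15', NULL),
       (2, 2, '2020-10-19', 1),
       (3, 1, '2020-10-20', NULL),
       (4, 2, '2020-11-15', 3);
```

With these rows, request 2 is 4 days after its parent and request 4 is 26 days after its parent.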
We would like to see your attempt, but on the face of it you need to treat this table as two tables and do a basic GROUP BY, made a little fancier by grouping on a CASE expression.
WITH dateDiffs as (
    /* perform our date calculations first, to get that out of the way */
    SELECT
        DATEDIFF(Day, parent.[Date], child.[Date]) as daysDiff,
        1 as rowsFound
    FROM (SELECT RequestID, [Date] FROM myTable WHERE Type = 1) parent
        INNER JOIN (SELECT ParentRequestID, [Date] FROM myTable WHERE Type = 2) child
            ON parent.requestID = child.parentRequestID
)
/* Now group and aggregate and enjoy your maths! */
SELECT
    case when daysDiff between 0 and 7 then '0-7'
         when daysDiff between 8 and 15 then '8-15'
         when daysDiff between 16 and 50 then '16-50'
         else '50+'
    end as myInterval,
    sum(rowsFound) as totalFound,
    (select sum(rowsFound) from dateDiffs) as totalRows,
    1.0 * sum(rowsFound) / (select sum(rowsFound) from dateDiffs) * 100.00 as percentFound
FROM dateDiffs
GROUP BY
    case when daysDiff between 0 and 7 then '0-7'
         when daysDiff between 8 and 15 then '8-15'
         when daysDiff between 16 and 50 then '16-50'
         else '50+'
    end;
This looks like basically a join and group by query:
with dates as (
      select 0 as lo, 7 as hi, '0-7 days' as grp union all
      select 8 as lo, 15 as hi, '8-15 days' union all
      select 16 as lo, 50 as hi, '16-50 days'
     )
select d.grp,
       -- count a child column so that empty groups report 0 rather than 1
       count(t.RequestId) as cnt,
       count(t.RequestId) * 1.0 / nullif(sum(count(t.RequestId)) over (), 0) as ratio
from dates d left join
     (t join
      t tp
      on tp.RequestId = t.ParentRequestId
     )
     on datediff(day, tp.[Date], t.[Date]) between d.lo and d.hi
-- d.lo must be grouped on as well so it can be used for ordering
group by d.grp, d.lo
order by d.lo;
The only trick is generating all of the date groups, so that you also get rows for intervals with zero matches.
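As a sketch of that trick, the interval list can also be written with a table value constructor and left-joined to the parent/child day differences, so every interval appears even when nothing falls into it. The table name myTable, the CTE names buckets and diffs, and the DECIMAL(5, 2) formatting are assumptions made for illustration, not part of either answer above.

```sql
-- Sketch only: myTable, buckets, and diffs are illustrative names.
WITH buckets AS (
    -- One row per interval, so empty intervals still appear in the output
    SELECT lo, hi, label
    FROM (VALUES (0, 7, '0-7 days'),
                 (8, 15, '8-15 days'),
                 (16, 50, '16-50 days')) AS v(lo, hi, label)
),
diffs AS (
    -- Day difference between each type 2 request and its type 1 parent
    SELECT DATEDIFF(DAY, parent.[Date], child.[Date]) AS daysDiff
    FROM myTable AS child
    INNER JOIN myTable AS parent
        ON parent.RequestId = child.ParentRequestId
    WHERE parent.[Type] = 1
      AND child.[Type] = 2
)
SELECT b.label AS [Interval],
       -- COUNT(d.daysDiff) ignores the NULLs produced by unmatched intervals
       CAST(100.0 * COUNT(d.daysDiff)
            / NULLIF(SUM(COUNT(d.daysDiff)) OVER (), 0) AS DECIMAL(5, 2)) AS Percentage
FROM buckets AS b
LEFT JOIN diffs AS d
    ON d.daysDiff BETWEEN b.lo AND b.hi
GROUP BY b.label, b.lo
ORDER BY b.lo;
```

Note that the denominator here only counts child requests that land in one of the listed intervals. If differences above 50 days should count toward the total, an additional open-ended interval would be needed, much like the '50+' branch in the first answer.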