Oracle SQL: Count between irregular timestamps
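Apologies if this has been asked here before, but I can't seem to find it. I keep finding sums per hour, but my question is about SUM and COUNT between timestamps that are defined in another column.
I have a table called incoming_orders: it shows the planned destination and the timestamp at which the order was received.
I have a second table called scheduled_output: it shows every scheduled output moment for each destination.
I have a third table called outgoing_orders: it shows the actual destination and the timestamp of the outgoing order.
So the data might be: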
--Incoming_orders:
Destination Timestamp
ROUTE B 14/03/2018 7:48:00
ROUTE A 14/03/2018 7:58:00
ROUTE A 14/03/2018 12:48:00
ROUTE C 14/03/2018 13:28:00
--Scheduled_Output
ROUTE A 14/03/2018 8:00:00
ROUTE A 14/03/2018 11:00:00
ROUTE A 14/03/2018 12:00:00
ROUTE A 14/03/2018 17:00:00
ROUTE B 14/03/2018 8:00:00
ROUTE B 14/03/2018 10:00:00
ROUTE B 14/03/2018 12:00:00
ROUTE C 14/03/2018 07:00:00
ROUTE C 14/03/2018 14:00:00
ROUTE C 14/03/2018 17:00:00
--Which would lead to the following outgoing_orders:
ROUTE A 14/03/2018 8:00:00
ROUTE B 14/03/2018 8:00:00
ROUTE C 14/03/2018 14:00:00
ROUTE A 14/03/2018 17:00:00
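Now I want to check whether the incoming order for Route A at 07:58 actually went into Route A's 08:00 output cycle. I am thinking of building a table that shows it like this: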
Destination output moment expected_output actual_output diff
Route A 8:00 1 1 0
Route A 11:00 0 0 0
Route A 12:00 0 0 0
Route A 17:00 1 1 0
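But the question is: how do I calculate the expected_output column? How do I put the Route A incoming order from 12:48 into the 12:00-17:00 group? It should count all orders between consecutive scheduled output moments, but I am not sure how to do that.
Can I CEIL, FLOOR or ROUND to the nearest scheduled_output value? Or can I somehow count the rows between row n and row n+1? Or is there another, simpler way?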
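I think the simplest approach is to determine, for each scheduled output, the previous scheduled output time for the same destination. That gives you the interval each output moment covers, roughly like this: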
SELECT destination,
       time_stamp,
       ( SELECT max( time_stamp )
         FROM SCHEDULED_OUTPUT t1
         WHERE t.destination = t1.destination
           AND t1.time_stamp < t.time_stamp
       ) as previous_time_stamp
FROM SCHEDULED_OUTPUT t
order by 1,2
Or, in a more compact form, using an analytic function:
SELECT destination,
       time_stamp,
       lag( time_stamp ) over (partition by destination order by time_stamp )
           as previous_time_stamp
FROM SCHEDULED_OUTPUT t
order by 1,2
Demo: http://sqlfiddle.com/#!4/c7bc9/1
| DESTINATION | TIME_STAMP | PREVIOUS_TIME_STAMP |
|-------------|-----------------------|-----------------------|
| ROUTE A | 2018-03-14 08:00:00.0 | (null) |
| ROUTE A | 2018-03-14 11:00:00.0 | 2018-03-14 08:00:00.0 |
| ROUTE A | 2018-03-14 12:00:00.0 | 2018-03-14 11:00:00.0 |
| ROUTE A | 2018-03-14 17:00:00.0 | 2018-03-14 12:00:00.0 |
| ROUTE B | 2018-03-14 08:00:00.0 | (null) |
| ROUTE B | 2018-03-14 10:00:00.0 | 2018-03-14 08:00:00.0 |
| ROUTE B | 2018-03-14 12:00:00.0 | 2018-03-14 10:00:00.0 |
| ROUTE C | 2018-03-14 07:00:00.0 | (null) |
| ROUTE C | 2018-03-14 14:00:00.0 | 2018-03-14 07:00:00.0 |
| ROUTE C | 2018-03-14 17:00:00.0 | 2018-03-14 14:00:00.0 |
Next, the result set above can be joined to INCOMING_ORDERS to calculate the counts:
SELECT x.destination, x.time_stamp as output_moment,
       count( y.DESTINATION ) as expected_output
FROM (
    SELECT destination,
           time_stamp,
           lag( time_stamp ) over (partition by destination order by time_stamp )
               as previous_time_stamp
    FROM SCHEDULED_OUTPUT t
) x
LEFT JOIN INCOMING_ORDERS y
    ON x.DESTINATION = y.DESTINATION
   AND y.TIME_STAMP <= x.TIME_STAMP
   AND ( y.TIME_STAMP > x.previous_time_stamp OR x.previous_time_stamp IS NULL )
GROUP BY x.destination, x.time_stamp
ORDER BY 1,2
Demo: http://sqlfiddle.com/#!4/c3958/2
| DESTINATION | OUTPUT_MOMENT | EXPECTED_OUTPUT |
|-------------|-----------------------|-----------------|
| ROUTE A | 2018-03-14 08:00:00.0 | 1 |
| ROUTE A | 2018-03-14 11:00:00.0 | 0 |
| ROUTE A | 2018-03-14 12:00:00.0 | 0 |
| ROUTE A | 2018-03-14 17:00:00.0 | 1 |
| ROUTE B | 2018-03-14 08:00:00.0 | 1 |
| ROUTE B | 2018-03-14 10:00:00.0 | 0 |
| ROUTE B | 2018-03-14 12:00:00.0 | 0 |
| ROUTE C | 2018-03-14 07:00:00.0 | 0 |
| ROUTE C | 2018-03-14 14:00:00.0 | 1 |
| ROUTE C | 2018-03-14 17:00:00.0 | 0 |
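As a cross-check on these counts, it can help to look at the assignment order by order, which is the "CEIL to the next scheduled_output value" idea from the question. The following is only a sketch, assuming INCOMING_ORDERS has the same DESTINATION and TIME_STAMP columns that the join uses; the aliases order_time and output_moment are made up for illustration:

```sql
-- For each incoming order, find the earliest scheduled output moment
-- for the same destination that is not before the order's timestamp.
-- Orders placed after the last scheduled moment get NULL.
SELECT i.destination,
       i.time_stamp AS order_time,
       ( SELECT MIN( s.time_stamp )
         FROM SCHEDULED_OUTPUT s
         WHERE s.destination = i.destination
           AND s.time_stamp >= i.time_stamp
       ) AS output_moment
FROM INCOMING_ORDERS i
ORDER BY 1, 2
```

With the sample data, the 12:48 order for ROUTE A should land on the 17:00 output moment, matching the count of 1 in that row of the table.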
The condition in the LEFT JOIN:
AND y.TIME_STAMP <= x.TIME_STAMP
AND ( y.TIME_STAMP > x.previous_time_stamp OR x.previous_time_stamp IS NULL )
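means that if an order is placed at 8:00:00 and the route also starts at 8:00:00, the order is still assigned to this "starting" route. If that is not possible (that is, an order placed at the exact moment a route starts must be assigned to the next route), change the condition to: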
AND y.TIME_STAMP < x.TIME_STAMP
AND ( y.TIME_STAMP >= x.previous_time_stamp OR x.previous_time_stamp IS NULL )
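To get from expected_output to the full comparison table sketched in the question, the counts can be joined to the actual outgoing orders. This is a minimal sketch under assumptions, not a tested solution: it assumes OUTGOING_ORDERS also has DESTINATION and TIME_STAMP columns, that an outgoing order carries exactly the timestamp of its scheduled output moment, and that diff means expected minus actual. The names slots, expected and actual are illustrative only.

```sql
WITH slots AS (
    -- each scheduled output moment with the previous one for the same destination
    SELECT destination,
           time_stamp,
           LAG( time_stamp ) OVER (PARTITION BY destination ORDER BY time_stamp)
               AS previous_time_stamp
    FROM SCHEDULED_OUTPUT
),
expected AS (
    -- incoming orders that fall into each (previous, current] interval
    SELECT s.destination,
           s.time_stamp AS output_moment,
           COUNT( i.destination ) AS expected_output
    FROM slots s
    LEFT JOIN INCOMING_ORDERS i
        ON i.destination = s.destination
       AND i.time_stamp <= s.time_stamp
       AND ( i.time_stamp > s.previous_time_stamp OR s.previous_time_stamp IS NULL )
    GROUP BY s.destination, s.time_stamp
),
actual AS (
    -- outgoing orders per destination and output moment (assumed exact match)
    SELECT destination,
           time_stamp AS output_moment,
           COUNT(*) AS actual_output
    FROM OUTGOING_ORDERS
    GROUP BY destination, time_stamp
)
SELECT e.destination,
       e.output_moment,
       e.expected_output,
       NVL( a.actual_output, 0 ) AS actual_output,
       e.expected_output - NVL( a.actual_output, 0 ) AS diff
FROM expected e
LEFT JOIN actual a
    ON a.destination = e.destination
   AND a.output_moment = e.output_moment
ORDER BY e.destination, e.output_moment
```

The LEFT JOIN from expected to actual keeps output moments with no outgoing orders, and NVL turns their missing count into 0. An outgoing order whose timestamp does not match any scheduled moment would be dropped by this join, so a FULL OUTER JOIN may be a better fit if that can happen in your data.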