Working with accumulated sums and SQL partitioning
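I have to work with some processes and need to calculate their lifetime up to a given date, but because they can be suspended I am stuck. I have a suspensions table (linked below). If I could get the suspension time (total days) as of each base date, I could solve the problem. I thought of using a running sum, but since a process can be suspended several times, a plain running sum does not work. The table contains the process ID, the base date for which I want the suspension time, and the suspension date. The table is quite intuitive.
For example, process 2301194 has two suspensions, and this is the running sum I compute in SQLite: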
SELECT
    *,
    SUM(TIME_SUSPENSION) OVER (PARTITION BY ID_PROCESS ORDER BY DATA_BASE) TIME_AUX
FROM
    (
        SELECT
            *,
            JULIANDAY(DATA_BASE) - JULIANDAY(DATA_SUSPENSION) TIME_SUSPENSION
        FROM
            SUSPENSIONS
        ORDER BY
            ID_PROCESS,
            DATA_BASE)
WHERE ID_PROCESS = 2301194;
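That is not the result I want. For each row, the desired value is the time at that date (the base date) plus the time accumulated up to the last suspension.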
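To make the gap concrete, here is a minimal sketch using Python's built-in sqlite3 module as a harness. The four rows are invented for illustration (they are not taken from suspensions.csv), and the ISO date format is an assumption. It runs the running sum from the question and compares it with the desired value, computed here with a plain loop: the elapsed time of the current suspension plus the longest elapsed time seen for each earlier suspension.

```python
import sqlite3

# Hypothetical data, not taken from suspensions.csv: one process, two suspensions.
rows = [
    (1, "2020-01-31", "2020-01-11"),
    (1, "2020-02-29", "2020-01-11"),
    (1, "2020-04-30", "2020-04-20"),
    (1, "2020-05-31", "2020-04-20"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SUSPENSIONS (ID_PROCESS INTEGER, DATA_BASE TEXT, DATA_SUSPENSION TEXT)"
)
conn.executemany("INSERT INTO SUSPENSIONS VALUES (?, ?, ?)", rows)

# Same running sum as in the question (window functions need SQLite 3.25 or later).
naive = conn.execute("""
    SELECT DATA_BASE, DATA_SUSPENSION, TIME_SUSPENSION,
           SUM(TIME_SUSPENSION) OVER (PARTITION BY ID_PROCESS ORDER BY DATA_BASE) TIME_AUX
    FROM (SELECT *, JULIANDAY(DATA_BASE) - JULIANDAY(DATA_SUSPENSION) TIME_SUSPENSION
          FROM SUSPENSIONS)
    ORDER BY DATA_BASE
""").fetchall()

# Desired value: current elapsed time plus the total of the finished suspensions.
closed_total = 0.0    # summed length of suspensions that have already ended
current_start = None  # start date of the suspension being scanned
current_max = 0.0     # longest elapsed time seen for that suspension
desired = []
for base, start, elapsed, _ in naive:
    if start != current_start:
        closed_total += current_max
        current_start, current_max = start, 0.0
    current_max = max(current_max, elapsed)
    desired.append(elapsed + closed_total)

print([r[3] for r in naive])  # [20.0, 69.0, 79.0, 120.0]
print(desired)                # [20.0, 49.0, 59.0, 90.0]
```

The running sum adds every row of the first suspension (20 + 49) instead of counting that suspension once at its longest observed length (49), which is why it overshoots from the second row onward.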
Data: https://raw.githubusercontent.com/jacksonMaike/database/master/trego/suspensions.csv
For convenience, I left a .cd containing the table in the repository.
Does anyone have a suggestion? Thanks in advance!
You can use a CTE to compute the time accumulated up to the last suspension, and then join it to the table:
WITH cte AS (
    -- one row per suspension, holding the total length of all earlier suspensions
    SELECT DATA_SUSPENSION,
           SUM(MAX(JULIANDAY(DATA_BASE) - JULIANDAY(DATA_SUSPENSION)))
               OVER (ORDER BY DATA_SUSPENSION ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) ACC_TIME_SUSPENSION
    FROM SUSPENSIONS
    WHERE ID_PROCESS = 2301194
    GROUP BY DATA_SUSPENSION
)
SELECT s.*,
       JULIANDAY(s.DATA_BASE) - JULIANDAY(s.DATA_SUSPENSION) TIME_SUSPENSION,
       JULIANDAY(s.DATA_BASE) - JULIANDAY(s.DATA_SUSPENSION) + COALESCE(c.ACC_TIME_SUSPENSION, 0) TIME_AUX
FROM SUSPENSIONS s INNER JOIN cte c
    ON c.DATA_SUSPENSION = s.DATA_SUSPENSION
-- without this filter, rows of other processes that share a suspension date would also match
WHERE s.ID_PROCESS = 2301194
ORDER BY s.DATA_BASE;
Or, for all ID_PROCESS values:
WITH cte AS (
    SELECT ID_PROCESS, DATA_SUSPENSION,
           SUM(MAX(JULIANDAY(DATA_BASE) - JULIANDAY(DATA_SUSPENSION)))
               OVER (
                   PARTITION BY ID_PROCESS
                   ORDER BY DATA_SUSPENSION ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
               ) ACC_TIME_SUSPENSION
    FROM SUSPENSIONS
    GROUP BY ID_PROCESS, DATA_SUSPENSION
)
SELECT s.*,
       JULIANDAY(s.DATA_BASE) - JULIANDAY(s.DATA_SUSPENSION) TIME_SUSPENSION,
       JULIANDAY(s.DATA_BASE) - JULIANDAY(s.DATA_SUSPENSION) + COALESCE(c.ACC_TIME_SUSPENSION, 0) TIME_AUX
FROM SUSPENSIONS s INNER JOIN cte c
    ON c.ID_PROCESS = s.ID_PROCESS AND c.DATA_SUSPENSION = s.DATA_SUSPENSION
ORDER BY s.ID_PROCESS, s.DATA_BASE;
See the demo.
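As a quick check of the all-process query, the sketch below runs it through Python's built-in sqlite3 module against a small invented table. The five rows and their ISO-formatted dates are assumptions for illustration, not data from suspensions.csv. Process 1 has two suspensions, process 2 has one.

```python
import sqlite3

# Hypothetical data, not taken from suspensions.csv.
rows = [
    (1, "2020-01-31", "2020-01-11"),  # first suspension of process 1
    (1, "2020-02-29", "2020-01-11"),
    (1, "2020-04-30", "2020-04-20"),  # second suspension of process 1
    (1, "2020-05-31", "2020-04-20"),
    (2, "2020-03-31", "2020-03-01"),  # only suspension of process 2
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SUSPENSIONS (ID_PROCESS INTEGER, DATA_BASE TEXT, DATA_SUSPENSION TEXT)"
)
conn.executemany("INSERT INTO SUSPENSIONS VALUES (?, ?, ?)", rows)

# The all-process query from the answer (window functions need SQLite 3.25 or later).
result = conn.execute("""
    WITH cte AS (
        SELECT ID_PROCESS, DATA_SUSPENSION,
               SUM(MAX(JULIANDAY(DATA_BASE) - JULIANDAY(DATA_SUSPENSION)))
                   OVER (
                       PARTITION BY ID_PROCESS
                       ORDER BY DATA_SUSPENSION
                       ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
                   ) ACC_TIME_SUSPENSION
        FROM SUSPENSIONS
        GROUP BY ID_PROCESS, DATA_SUSPENSION
    )
    SELECT s.*,
           JULIANDAY(s.DATA_BASE) - JULIANDAY(s.DATA_SUSPENSION) TIME_SUSPENSION,
           JULIANDAY(s.DATA_BASE) - JULIANDAY(s.DATA_SUSPENSION)
               + COALESCE(c.ACC_TIME_SUSPENSION, 0) TIME_AUX
    FROM SUSPENSIONS s INNER JOIN cte c
        ON c.ID_PROCESS = s.ID_PROCESS AND c.DATA_SUSPENSION = s.DATA_SUSPENSION
    ORDER BY s.ID_PROCESS, s.DATA_BASE
""").fetchall()

# Columns: ID_PROCESS, DATA_BASE, DATA_SUSPENSION, TIME_SUSPENSION, TIME_AUX
for row in result:
    print(row)

print([r[4] for r in result])  # [20.0, 49.0, 59.0, 90.0, 30.0]
```

The frame `ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING` excludes the current suspension, so the CTE value is NULL for a process's first suspension (turned into 0 by `COALESCE`) and 49 for the second suspension of process 1. Note that the length of an earlier suspension is taken as its longest elapsed time among the recorded base dates, so it is only as precise as those dates allow.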