Postgresql: query to sum date difference between status

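I have a table that monitors some equipment, with a timestamp and a status for each record. I want to compute the "running days" between the "RUN" and "STOP" statuses. I tried the following query: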

select run.stamp - 
(
    -- select the first STOP status after the current RUN status
      select stamp 
      from well_monitoring 
      where stamp > run.stamp and status = 'STOP'
      order by stamp limit 1
  )
from well_monitoring run
where
  run.status = 'RUN'
  and ( -- we want only the first RUN
      select status 
      from well_monitoring 
      where stamp < run.stamp 
    order by stamp desc limit 1) <> 'RUN'
order by run.stamp

See the SQLFiddle for the table creation/data and to test the query.

When I try to sum the differences to get the total running days:

select SUM( run.stamp - ... ) ...

I get the following error:

ERROR: column "run.stamp" must appear in the GROUP BY clause or be used in an aggregate function Position: 448

So:
- How can I update my query to get the sum?
- The query has two subqueries; is there a better way (a CTE, perhaps)?

(Postgres version: 9.1.7)

You should wrap the query like this:

select sum(times) as sum_of_times 
from (

  select run.stamp - 
    (
      -- select the first STOP status after the current RUN status
        select stamp 
        from well_monitoring 
        where stamp > run.stamp and status = 'STOP'
        order by stamp limit 1
    ) times
  from well_monitoring run
  where
    run.status = 'RUN'
    and ( -- we want only the first RUN
        select status 
        from well_monitoring 
        where stamp < run.stamp 
      order by stamp desc limit 1) <> 'RUN'
  order by run.stamp

) alias

as in this sqlfiddle.
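A possible reading of the error, offered as an assumption rather than something verified against the original data: the reported position (448) is near the end of the statement, which points at the trailing `order by run.stamp`. Once the select list contains only an aggregate, `run.stamp` is no longer available for ordering. Under that assumption, a minimal sketch that sums directly, without the wrapper, simply drops the ORDER BY:

```sql
-- Sketch: the query from the question, aggregated in place.
-- The trailing ORDER BY is removed: a single aggregated row has nothing to order,
-- and run.stamp may not be referenced there without a GROUP BY.
SELECT SUM(
         run.stamp -
         ( -- first STOP after the current RUN
           SELECT stamp
           FROM well_monitoring
           WHERE stamp > run.stamp AND status = 'STOP'
           ORDER BY stamp LIMIT 1
         )
       ) AS sum_of_times
FROM well_monitoring run
WHERE run.status = 'RUN'
  AND ( -- keep only the first RUN of a series
        SELECT status
        FROM well_monitoring
        WHERE stamp < run.stamp
        ORDER BY stamp DESC LIMIT 1
      ) <> 'RUN';
```

As in the original query, the difference is computed as RUN minus STOP, so the resulting interval is negative; swapping the two operands gives a positive duration.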

In this sqlfiddle you can see how to use a CTE, in case you want all the times and the sum of the times in one result set.
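Since the linked fiddle is not reproduced here, the following is only a sketch of what such a CTE could look like, not the exact query behind the link. The names `run_times` and `run_stamp` are illustrative assumptions; `times` and `sum_of_times` follow the wrapped query above.

```sql
-- Sketch: one row per RUN/STOP pair, plus the grand total repeated on every row.
WITH run_times AS (
    SELECT run.stamp AS run_stamp,          -- illustrative column name
           run.stamp -
           ( -- first STOP after the current RUN
             SELECT stamp
             FROM well_monitoring
             WHERE stamp > run.stamp AND status = 'STOP'
             ORDER BY stamp LIMIT 1
           ) AS times
    FROM well_monitoring run
    WHERE run.status = 'RUN'
      AND ( -- keep only the first RUN of a series
            SELECT status
            FROM well_monitoring
            WHERE stamp < run.stamp
            ORDER BY stamp DESC LIMIT 1
          ) <> 'RUN'
)
SELECT run_stamp,
       times,
       (SELECT SUM(times) FROM run_times) AS sum_of_times
FROM run_times
ORDER BY run_stamp;
```

The CTE is written once and read twice: once for the detail rows and once for the total. CTEs are available in the 9.1 release mentioned in the question.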

This could probably be simplified a bit, but it works (note that it does not need window functions, so it can be adapted to any SQL implementation):

SELECT one.id , one.stamp, one.status
        , two.id, two.stamp, two.status
        , (two.stamp - one.stamp) AS diff
FROM well_monitoring one
JOIN well_monitoring two ON two.well_id = one.well_id
        AND two.stamp > one.stamp
        AND two.status = 'STOP'
        -- find the first STOP:
        -- there should be no other STOP
        -- between one: RUN
        --     and two: STOP
        AND NOT EXISTS (
                SELECT * FROM well_monitoring x
                WHERE x.well_id = one.well_id
                AND x.stamp > one.stamp
                AND x.stamp < two.stamp
                AND x.status = 'STOP'
                )
WHERE one.status = 'RUN'
        -- If there are consecutive RUNs
        -- (without an intervening STOP)
        -- one should be the first RUN
AND NOT EXISTS (
        SELECT * FROM well_monitoring x
        WHERE x.well_id = one.well_id
        AND x.status = 'RUN'
        AND x.stamp < one.stamp
        AND NOT EXISTS (
                SELECT * FROM well_monitoring xx
                WHERE xx.well_id = x.well_id
                AND xx.stamp > x.stamp
                AND xx.stamp < one.stamp
                AND xx.status <> 'RUN'
                )
        )
        ;

Adding the aggregation is left as an exercise for the reader.
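One way that aggregation might look, as a sketch only: keep the same joins and conditions, replace the detail columns with a SUM, and group by well. The alias `total_running` is an illustrative name, not part of the original answer.

```sql
-- Sketch: the pairing query above, reduced to one total per well.
SELECT one.well_id
        , SUM(two.stamp - one.stamp) AS total_running   -- illustrative alias
FROM well_monitoring one
JOIN well_monitoring two ON two.well_id = one.well_id
        AND two.stamp > one.stamp
        AND two.status = 'STOP'
        -- two must be the first STOP after one
        AND NOT EXISTS (
                SELECT * FROM well_monitoring x
                WHERE x.well_id = one.well_id
                AND x.stamp > one.stamp
                AND x.stamp < two.stamp
                AND x.status = 'STOP'
                )
WHERE one.status = 'RUN'
        -- one must be the first RUN of a run of consecutive RUNs
AND NOT EXISTS (
        SELECT * FROM well_monitoring x
        WHERE x.well_id = one.well_id
        AND x.status = 'RUN'
        AND x.stamp < one.stamp
        AND NOT EXISTS (
                SELECT * FROM well_monitoring xx
                WHERE xx.well_id = x.well_id
                AND xx.stamp > x.stamp
                AND xx.stamp < one.stamp
                AND xx.status <> 'RUN'
                )
        )
GROUP BY one.well_id
        ;
```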

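This is roughly the same as @joop's answer, but it uses window functions for edge detection. Note that the rank plus NOT EXISTS() is needed to pair each stop event with its nearest start event.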

WITH edges AS (
        SELECT id AS this_id
        , status AS this_status
        , well_id AS well_id
        , LAG(status) over ww AS prev_status
        , dense_rank() over ww AS rnk
        FROM well_monitoring
        WINDOW ww AS (partition by well_id ORDER BY stamp)
        )
, starters AS (
        SELECT this_id, well_id, rnk
        FROM edges
        WHERE this_status = 'RUN'
        AND COALESCE(prev_status, 'OMG') <> 'RUN'
        )
, stoppers AS (
        SELECT this_id, well_id, rnk
        FROM edges
        WHERE prev_status = 'RUN'
        AND this_status <> 'RUN'
        )
SELECT m0.well_id
        , SUM(m1.stamp - m0.stamp)::interval AS duration
FROM starters s0
JOIN stoppers s1 ON s1.well_id = s0.well_id
        AND s1.rnk > s0.rnk
        AND NOT EXISTS (
                SELECT * FROM stoppers nx
                WHERE nx.well_id = s0.well_id
                AND nx.rnk > s0.rnk AND nx.rnk < s1.rnk
                )
JOIN well_monitoring m0 ON m0.id = s0.this_id
JOIN well_monitoring m1 ON m1.id = s1.this_id
GROUP BY m0.well_id
        ;

Result:

 well_id |     duration      
---------+-------------------
       1 | 320 days 64:28:00
(1 row)

(I suspect the 64 hours is a bug...)
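The 64 hours is probably not a bug in the query. Subtracting two timestamps yields an interval with separate day and time parts, and SUM adds those parts independently without carrying hours over into days. A minimal sketch of normalizing such a value with PostgreSQL's built-in `justify_hours`, using the literal from the result above:

```sql
-- Sketch: fold every full 24 hours of an interval into its day part.
SELECT justify_hours(interval '320 days 64:28:00') AS duration;
-- duration: 322 days 16:28:00
```

In the query above, the same effect could be obtained by wrapping the aggregate, for example `justify_hours(SUM(m1.stamp - m0.stamp))`.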