SQL: Filter rows by direction
I have a table with 2 columns: date (timestamp) and status (boolean).
I have a lot of values, for example:
| date | status |
|-------------------------- |-------- |
| 2018-11-05T19:04:21.125Z | true |
| 2018-11-05T19:04:22.125Z | true |
| 2018-11-05T19:04:23.125Z | true |
....
I need to get a result like this:
| date_from | date_to | status |
|-------------------------- |-------------------------- |-------- |
| 2018-11-05T19:04:21.125Z | 2018-11-05T19:04:27.125Z | true |
| 2018-11-05T19:04:27.125Z | 2018-11-05T19:04:47.125Z | false |
| 2018-11-05T19:04:47.125Z | 2018-11-05T19:04:57.125Z | true |
So I need to filter out all the consecutive "same" values and get back only the ranges over which the status is true or false.
I wrote this query:
SELECT max("current_date"), current_status, previous_status
FROM (SELECT date as "current_date",
status as current_status,
(lag(status, 1) OVER (ORDER BY msgtime))::boolean AS previous_status
FROM "table" as table
) as raw_data
group by current_status, previous_status
But in response I get no more than 4 rows.
Yes, you can use LAG, but you also need a running counter that increments every time the status changes:
WITH cte1 AS (
SELECT date, status, CASE WHEN LAG(status) OVER (ORDER BY date) = status THEN 0 ELSE 1 END AS chg
FROM yourdata
), cte2 AS (
SELECT date, status, SUM(chg) OVER (ORDER BY date) AS grp
FROM cte1
)
SELECT MIN(date) AS date_from, MAX(date) AS date_to, status
FROM cte2
GROUP BY grp, status
ORDER BY date_from
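To see why the running counter works, here is a minimal Python sketch that mimics the two CTEs step by step. The function name `islands_by_counter` and the sample rows are assumptions made up for illustration; this emulates the query's logic and is not how the database executes it.

```python
from itertools import accumulate

def islands_by_counter(rows):
    """rows: iterable of (date, status). Returns [(date_from, date_to, status), ...]."""
    rows = sorted(rows, key=lambda r: r[0])  # ORDER BY date
    # cte1: chg is 0 when the status equals the previous row's status, else 1.
    # The first row has no previous row (LAG is NULL), so it gets chg = 1.
    chg = [
        0 if i > 0 and rows[i - 1][1] == status else 1
        for i, (_, status) in enumerate(rows)
    ]
    # cte2: the running SUM(chg) gives every run of equal statuses its own number.
    grp = list(accumulate(chg))
    # Final SELECT: MIN(date) and MAX(date) per (grp, status).
    groups = {}
    for (date, status), g in zip(rows, grp):
        lo, hi = groups.get((g, status), (date, date))
        groups[(g, status)] = (min(lo, date), max(hi, date))
    # ORDER BY date_from
    return sorted((lo, hi, status) for (_, status), (lo, hi) in groups.items())

# Hypothetical sample data: two true rows, two false rows, one true row.
rows = [
    ("2018-11-05T19:04:21.125Z", True),
    ("2018-11-05T19:04:22.125Z", True),
    ("2018-11-05T19:04:23.125Z", False),
    ("2018-11-05T19:04:24.125Z", False),
    ("2018-11-05T19:04:25.125Z", True),
]
for island in islands_by_counter(rows):
    print(island)
```

For these rows `chg` is `[1, 0, 1, 0, 1]` and `grp` is `[1, 1, 2, 2, 3]`, so the five rows collapse into three ranges.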
This is a gaps-and-islands problem. A typical approach uses the difference of row numbers:
select min(date), max(date), status
from (select t.*,
row_number() over (order by date) as seqnum,
row_number() over (partition by status order by date) as seqnum_s
from t
) t
group by status, (seqnum - seqnum_s);
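The row-number difference can be hard to picture, so here is a small Python sketch that computes `seqnum` and `seqnum_s` by hand. The function name `islands_by_row_number` and the sample rows are hypothetical, chosen only to illustrate the idea; the sketch mirrors the query's grouping and is not a description of the database's execution.

```python
from collections import defaultdict

def islands_by_row_number(rows):
    """rows: iterable of (date, status). Returns [(date_from, date_to, status), ...]."""
    rows = sorted(rows, key=lambda r: r[0])  # both window functions order by date
    per_status = defaultdict(int)            # counters behind "partition by status"
    groups = {}
    for seqnum, (date, status) in enumerate(rows, start=1):
        per_status[status] += 1
        seqnum_s = per_status[status]
        # Within one unbroken run both counters advance together,
        # so their difference stays constant for that run.
        key = (status, seqnum - seqnum_s)
        lo, hi = groups.get(key, (date, date))
        groups[key] = (min(lo, date), max(hi, date))
    # Sorted by start date here for readability; the query itself has no ORDER BY.
    return sorted((lo, hi, status) for (status, _), (lo, hi) in groups.items())

# Hypothetical sample data: two true rows, two false rows, one true row.
rows = [
    ("2018-11-05T19:04:21.125Z", True),
    ("2018-11-05T19:04:22.125Z", True),
    ("2018-11-05T19:04:23.125Z", False),
    ("2018-11-05T19:04:24.125Z", False),
    ("2018-11-05T19:04:25.125Z", True),
]
for island in islands_by_row_number(rows):
    print(island)
```

For these rows the differences are 0, 0, 2, 2, 2. The false run and the last true row share the difference 2, which is why `status` has to be part of the grouping key alongside the difference.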