Excluding rows in a SQL query based on values in another row while preserving multiple outputs of a single part ID
I have machine output data in the following form:
DATETIME ID VALUE
8-28-20 20:55:10 part1 13
8-28-20 20:56:60 part1 20
8-28-20 20:57:22 part1 25
8-28-20 20:59:39 part2 9
8-28-20 21:10:55 part3 33
8-28-20 21:14:30 part1 14
I need to produce a new table by removing some rows:
DATETIME ID VALUE
8-28-20 20:57:22 part1 25
8-28-20 20:59:39 part2 9
8-28-20 21:10:55 part3 33
8-28-20 21:14:30 part1 14
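The machine sometimes collects multiple VALUEs per run, but I only need the last one (the value is cumulative). However, I may have multiple runs of the same ID per shift, and two consecutive runs of the same ID are not impossible.
Is it possible to use SQL to filter out all rows whose ID equals the ID of the row above, provided the VALUE is also greater than that of the row above?
Some similar questions have been posted here, but they all end up grouping the rows and taking the maximum value, which would capture only one run per ID per time period.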
You can try the following, using row_number():
select * from
(
select *,
row_number() over(partition by dateadd(hour, datediff(hour, 0, DATETIME), 0), id order by DATETIME desc) as rn
from tablename
)A where rn=1
You seem to want the rows after which the id changes or the value drops, that is, the last row of each run:
select t.*
from (select t.*,
lead(id) over (order by datetime) as next_id,
lead(value) over (order by datetime) as next_value
from t
) t
where next_id is null or next_id <> id or
(next_id = id and next_value < value)
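To check the LEAD() approach against the sample data from the question, here is a minimal sketch that runs the same filter in an in-memory SQLite database through Python's sqlite3 module. The table name t, the column name dttm, and the helper last_rows_per_run are assumptions made for this illustration, and SQLite 3.25 or newer is assumed for window function support. The condition is written without the redundant next_id = id test, which does not change the result.

```python
import sqlite3

# Sample rows copied from the question. Timestamps are kept as ISO-style text,
# which sorts chronologically (the odd "20:56:60" is verbatim from the question).
ROWS = [
    ("2020-08-28 20:55:10", "part1", 13),
    ("2020-08-28 20:56:60", "part1", 20),
    ("2020-08-28 20:57:22", "part1", 25),
    ("2020-08-28 20:59:39", "part2", 9),
    ("2020-08-28 21:10:55", "part3", 33),
    ("2020-08-28 21:14:30", "part1", 14),
]

QUERY = """
SELECT dttm, id, value
FROM (SELECT t.*,
             LEAD(id)    OVER (ORDER BY dttm) AS next_id,
             LEAD(value) OVER (ORDER BY dttm) AS next_value
      FROM t
     ) AS t
WHERE next_id IS NULL        -- last row overall
   OR next_id <> id          -- the next row belongs to another part
   OR next_value < value     -- same part, but the cumulative value restarted
ORDER BY dttm
"""


def last_rows_per_run(rows):
    """Load rows into a throwaway table (hypothetical name: t) and apply the filter."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (dttm TEXT, id TEXT, value INTEGER)")
    con.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in last_rows_per_run(ROWS):
        print(row)
```

Because the filter looks only at the following row, it also separates two back-to-back runs of the same ID, as long as the cumulative value of the second run starts below the final value of the first.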
Slightly more general, and also as an example of how to obtain a session ID without a dedicated OLAP function for it:
WITH
-- your input
input(dttm,id,value) AS (
SELECT TIMESTAMP '2020-08-28 20:55:10','part1',13
UNION ALL SELECT TIMESTAMP '2020-08-28 20:56:60','part1',20
UNION ALL SELECT TIMESTAMP '2020-08-28 20:57:22','part1',25
UNION ALL SELECT TIMESTAMP '2020-08-28 20:59:39','part2',9
UNION ALL SELECT TIMESTAMP '2020-08-28 21:10:55','part3',33
UNION ALL SELECT TIMESTAMP '2020-08-28 21:14:30','part1',14
)
,
-- add a counter that is at 1 whenever the id changes over time
with_chg AS (
SELECT
CASE
WHEN LAG(id) OVER(ORDER BY dttm) <> id THEN 1
ELSE 0
END AS chg_count
, *
FROM input
)
,
-- use the running sum of that change counter to get a session id
with_session AS (
SELECT
SUM(chg_count) OVER(ORDER BY dttm) AS session_id
, dttm
, id
, value
FROM with_chg
)
,
-- partition by the session id, order by datetime descending to get
-- the row number of 1 for the right row
with_rownum AS (
SELECT
ROW_NUMBER() OVER(PARTITION BY session_id ORDER BY dttm DESC) AS rownum
, dttm
, id
, value
FROM with_session
)
-- finally, filter by row number 1 and order back by datetime
SELECT
dttm
, id
, value
FROM with_rownum
WHERE rownum = 1
ORDER BY 1
;
-- out dttm | id | value
-- out ---------------------+-------+-------
-- out 2020-08-28 20:57:22 | part1 | 25
-- out 2020-08-28 20:59:39 | part2 | 9
-- out 2020-08-28 21:10:55 | part3 | 33
-- out 2020-08-28 21:14:30 | part1 | 14
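The session query above starts a new session only when the id changes, so two consecutive runs of the same ID would be merged into one session. A possible variation, sketched here under assumptions, also starts a new session when the cumulative value drops. It is run through Python's sqlite3 module against an in-memory SQLite database (3.25 or newer assumed for window functions); the table name readings and the helper last_row_per_session are hypothetical names for this illustration, and timestamps are assumed to be unique.

```python
import sqlite3

QUERY = """
WITH
-- flag a row with 1 when the id changes or the cumulative value drops
with_chg AS (
  SELECT dttm, id, value,
         CASE
           WHEN LAG(id)    OVER (ORDER BY dttm) <> id    THEN 1
           WHEN LAG(value) OVER (ORDER BY dttm) >  value THEN 1
           ELSE 0
         END AS chg_count
  FROM readings
),
-- the running sum of the flags is the session id
with_session AS (
  SELECT SUM(chg_count) OVER (ORDER BY dttm) AS session_id,
         dttm, id, value
  FROM with_chg
),
-- number the rows of each session from the latest one backwards
with_rownum AS (
  SELECT ROW_NUMBER() OVER (PARTITION BY session_id ORDER BY dttm DESC) AS rownum,
         session_id, dttm, id, value
  FROM with_session
)
SELECT session_id, dttm, id, value
FROM with_rownum
WHERE rownum = 1
ORDER BY dttm
"""


def last_row_per_session(rows):
    """Return (session_id, dttm, id, value) for the last row of each session."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE readings (dttm TEXT, id TEXT, value INTEGER)")
    con.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    # Made-up rows: two back-to-back runs of part1, then one row of part2.
    demo = [
        ("2020-08-28 20:00:00", "part1", 5),
        ("2020-08-28 20:01:00", "part1", 9),
        ("2020-08-28 20:02:00", "part1", 3),
        ("2020-08-28 20:03:00", "part1", 7),
        ("2020-08-28 20:04:00", "part2", 4),
    ]
    for row in last_row_per_session(demo):
        print(row)
```

Keeping session_id in the output makes it possible to aggregate per run later, for example to count the readings in each run, which the LEAD()-based filter does not offer directly.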