Exclude rows that come before a specific column value within each group (SQL, Teradata)
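I have a table in Teradata with the structure below. My analysis starts once the data has been uploaded, i.e. when the STATUS column has the value 'UPLOADED'. I want to exclude, within each group, the rows that come before the status becomes 'UPLOADED'. All events happen in real time, i.e. the timestamp value increases with every new event.
Input table data --> table_input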
ID STATUS TMP
A1 UPLOADED 06/16/2021 08:38:44.535
A1 A 06/16/2021 16:20:40.014
A1 (B) 06/16/2021 17:15:36.488
A1 C 06/16/2021 17:15:36.846
A1 A 06/16/2021 17:15:36.883
B1 A2 06/16/2021 08:34:09.974
B1 L 06/16/2021 08:34:10.271
B1 L 06/16/2021 14:44:33.677
B1 (R) 06/16/2021 14:52:21.812
B1 UPLOADED 06/16/2021 16:05:36.346
B1 AP 06/16/2021 16:05:36.499
B1 (R) 06/16/2021 16:05:36.718
C1 C 06/16/2021 16:05:36.764
C1 UPLOADED 06/16/2021 08:49:43.796
C1 UPLOADED 06/16/2021 08:49:43.841
C1 L 06/16/2021 14:50:39.667
C1 UPLOADED 06/16/2021 14:52:50.149
C1 (R) 06/16/2021 16:05:43.998
Expected output: the result excludes the rows that come before the 'UPLOADED' status.
ID STATUS TMP
A1 UPLOADED 06/16/2021 08:38:44.535
A1 A 06/16/2021 16:20:40.014
A1 (B) 06/16/2021 17:15:36.488
A1 C 06/16/2021 17:15:36.846
A1 A 06/16/2021 17:15:36.883
B1 UPLOADED 06/16/2021 16:05:36.346
B1 AP 06/16/2021 16:05:36.499
B1 (R) 06/16/2021 16:05:36.718
C1 UPLOADED 06/16/2021 08:49:43.796
C1 UPLOADED 06/16/2021 08:49:43.841
C1 L 06/16/2021 14:50:39.667
C1 UPLOADED 06/16/2021 14:52:50.149
C1 (R) 06/16/2021 16:05:43.998
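I am using Teradata SQL, but SQL Server SQL would also be fine.
I have been trying to use window functions, but have not had any success so far.
We could also treat the TMP column as a timestamp and write the logic as: exclude all rows whose timestamp is less than the timestamp of the first occurrence of 'UPLOADED'.
You can use the qualify clause in Teradata: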
select t.*
from t
qualify tmp >= min(case when status = 'UPLOADED' then tmp end) over (partition by id);
And, although you can use window functions for this, you can also write it with a correlated subquery:
select t.*
from t
where t.tmp >= (select min(t2.tmp)
                from t t2
                where t2.id = t.id and t2.status = 'UPLOADED'
               );
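Since the question mentions that SQL Server would also be acceptable, and qualify is not available there, the same window-function filter can be sketched with a derived table. This is only a minimal sketch: it assumes the table is named table_input as in the question, that tmp is stored as a real timestamp type (so that >= compares chronologically rather than as text), and the alias first_uploaded is made up for illustration.

```sql
-- Portable sketch of the QUALIFY query, using a derived table instead.
-- Assumptions: table_input(id, status, tmp), with tmp a timestamp type.
select x.id, x.status, x.tmp
from (
    select t.id, t.status, t.tmp,
           -- earliest 'UPLOADED' timestamp per id; NULL when the id has none
           min(case when t.status = 'UPLOADED' then t.tmp end)
               over (partition by t.id) as first_uploaded
    from table_input t
) x
-- keep the first 'UPLOADED' row and everything after it;
-- ids without any 'UPLOADED' row compare against NULL and drop out
where x.tmp >= x.first_uploaded
order by x.id, x.tmp;
```

Note that, as with the two queries above, a group that never reaches 'UPLOADED' returns no rows at all, because the comparison against NULL is never true.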