Checking previous rows to see if value exists
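The dataset has several rows for each customer ID, each with a timestamp (TS).
For each customer ID, I want to check whether the Status column has contained the value CANCELLED in any earlier row for that customer ID, with rows ordered by TS.
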
CustomerID | Status | TS |
---|---|---|
Vimes | CANCELLED | Jan 1 |
Vimes | ACTIVE | Jan 2 |
Vimes | CANCELLED | Jan 3 |
Sybill | ACTIVE | Jan 2 |
Sybill | ACTIVE | Jan 5 |
Sybill | ACTIVE | Jan 6 |

The result set should look like this, with an added Rejoiner column that flags whether any previous value of the Status column was CANCELLED:

CustomerID | Status | TS | Rejoiner |
---|---|---|---|
Vimes | CANCELLED | Jan 1 | No |
Vimes | ACTIVE | Jan 2 | Yes |
Vimes | CANCELLED | Jan 3 | Yes |
Sybill | ACTIVE | Jan 2 | No |
Sybill | ACTIVE | Jan 5 | No |
Sybill | CANCELLED | Jan 6 | No |
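
Use the query below:

    select *,
      -- 'Yes' if any row on an earlier date for this customer was CANCELLED
      if(countif(status = 'CANCELLED') over win > 0, 'Yes', 'No') as Rejoiner
    from your_table
    window win as (
      partition by customerid order by unix_date(date(ts))
      range between unbounded preceding and 1 preceding  -- frame ends one day before the current row
    )
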
Applied to the sample data in your question, this returns the Rejoiner values shown in the expected result above.
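
For readers without BigQuery access, the same window frame can be tried locally. The following is a minimal sketch in Python using the standard-library sqlite3 module, not the answer's exact query: SQLite has no countif, if, or unix_date, so count(case ...), case when, and julianday(date(ts)) are swapped in, and it assumes SQLite 3.28 or newer (needed for range frames with a numeric offset). The year 2023 in the sample dates and the helper name rejoiner_flags are assumptions made for illustration only.

```python
import sqlite3

# Sample rows from the question's input table.
# Assumption: the year 2023 is made up; the question only gives "Jan 1" etc.
rows = [
    ("Vimes", "CANCELLED", "2023-01-01"),
    ("Vimes", "ACTIVE", "2023-01-02"),
    ("Vimes", "CANCELLED", "2023-01-03"),
    ("Sybill", "ACTIVE", "2023-01-02"),
    ("Sybill", "ACTIVE", "2023-01-05"),
    ("Sybill", "ACTIVE", "2023-01-06"),
]

# SQLite rendition of the answer's query (dialect substitutions, same frame):
#   countif(cond)        -> count(case when cond then 1 end)
#   if(a, b, c)          -> case when a then b else c end
#   unix_date(date(ts))  -> julianday(date(ts))  (a day number, one unit per day)
QUERY = """
select customerid, status, ts,
  case when count(case when status = 'CANCELLED' then 1 end) over win > 0
       then 'Yes' else 'No' end as rejoiner
from your_table
window win as (
  partition by customerid
  order by julianday(date(ts))
  range between unbounded preceding and 1 preceding
)
order by customerid desc, ts
"""


def rejoiner_flags(data):
    """Load rows into an in-memory table and return (customerid, status, ts, rejoiner)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table your_table (customerid text, status text, ts text)")
    conn.executemany("insert into your_table values (?, ?, ?)", data)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


flags = rejoiner_flags(rows)
for row in flags:
    print(row)
```

Because the frame is defined by range over the day number and ends at 1 preceding, rows that fall on the same day as the current row are left out of the frame, so a cancellation only counts from the following day onward.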