Skip rows in BigQuery based on difference from value in previous row
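Assuming the table below is sorted by value (DESC), how can I return only the rows where the difference between the current value and the value in the previous row is less than some number x (for example 2), and discard all following rows the first time that condition is not met?
That is, return only rows 1 and 2 below: the values in rows 2 and 3 differ by 9.0 - 4.0 = 5.0 > 2, so rows 3 and 4 are skipped.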
with t1 as (
select 1 as id, "a" as name, 10.0 as value UNION ALL
select 2, "b", 9.0 UNION ALL
select 3, "c", 4.0 UNION ALL
select 4, "d", 1.0
)
Expected output:
id, name, value
1, a, 10.0
2, b, 9.0
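We can use lag() to get the difference from the previous row, then combine a self-join on a.id >= b.id with max(value_diff) <= 2 to filter the result: a row is kept only if no difference up to and including that row exceeds 2.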
with t1 as (
select 1 as id, 'a' as name, 10.0 as value UNION ALL
select 2, 'b', 9.0 UNION ALL
select 3, 'c', 4.0 UNION ALL
select 4, 'd', 1.0
)
select
a.id, a.name, a.value,
max(b.value_diff) max_diff
from t1 a
join (select id, abs(coalesce(value - lag(value) over (order by id),0)) as value_diff from t1 )b
on a.id >= b.id
group by a.id, a.name, a.value
having max(b.value_diff) <= 2;
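As a possible alternative to the self-join, the same "largest difference seen so far" can be computed with a running max() window over the lag() differences. The sketch below is only an illustration: it runs the query against SQLite through Python's sqlite3 module as an offline stand-in for BigQuery (this assumes a bundled SQLite of 3.25 or newer, which is when window functions were added). The names diffs, running, max_diff and rows_within_threshold are made up for this example; the SQL uses only lag(), max() over a window frame, abs() and coalesce(), so it should carry over to BigQuery with the ? placeholder swapped for a literal or a query parameter.

```python
import sqlite3

# Hypothetical query: a running max over the lag() differences replaces the self-join.
QUERY = """
with t1 as (
  select 1 as id, 'a' as name, 10.0 as value UNION ALL
  select 2, 'b', 9.0 UNION ALL
  select 3, 'c', 4.0 UNION ALL
  select 4, 'd', 1.0
),
diffs as (
  -- difference from the previous row; the first row has no predecessor, so use 0
  select id, name, value,
         abs(coalesce(value - lag(value) over (order by id), 0)) as value_diff
  from t1
),
running as (
  -- largest difference seen from the first row up to and including this row
  select id, name, value,
         max(value_diff) over (
           order by id rows between unbounded preceding and current row
         ) as max_diff
  from diffs
)
select id, name, value
from running
where max_diff <= ?
order by id
"""


def rows_within_threshold(threshold):
    """Return the leading rows until a difference first exceeds `threshold`."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(QUERY, (threshold,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(rows_within_threshold(2))  # [(1, 'a', 10.0), (2, 'b', 9.0)]
```

Because max_diff never decreases as id grows, the first row that exceeds the threshold also excludes every row after it, which is the "skip the rest" behaviour asked for. Like the answer above, this orders by id and keeps rows whose difference is at most 2 (<=), whereas the question says "less than"; use < if the boundary case matters.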