Get data clubbed from Redshift

I am trying to organize data in a certain way. Here is what I am trying to do.

I have a table in Redshift from which we are trying to get the following output.

Table: foo

e1 | c1 | c2
1  | 1  | 2
1  | 3  | 4
1  | 5  | 7
1  | 9  | 15
2  | 3  | 4
2  | 7  | 8

We are trying to club (merge) all the rows where the difference between the previous row's c2 and the next row's c1 is less than 1.

Expected output

e1 | c1 | c2
1  | 1  | 7
1  | 9  | 15
2  | 3  | 4
2  | 7  | 8

Current output

e1 | c1 | c2
1  | 1  | 4
1  | 3  | 7
2  | 3  | 4
2  | 7  | 8

I tried using a CTE. Here is the query I am working on. The results I get are either isolated, or

CTE:

with es as(
select *
from foo
where e1 not in (SELECT t1.e1
  FROM foo as t1 
  inner join foo as t2
  on t1.e1=t2.e1 and (t2.c1-t1.c2)=1)
union all
SELECT t1.e1
      ,t1.c1
      ,isnull(t2.c2, t1.c2) as c2
  FROM foo as t1 
  inner join foo as t2
  on t1.e1=t2.e1 and (t2.c1-t1.c2)=1 
 )
 select * from es
 where e1 is not null

Can someone help me?

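I assume you mean "We are trying to club all the rows where the difference between the previous row's c2 and the next row's c1 is less than 1, where e1 is the same".

You can use window functions for that. LEAD gives you the c1 of the following row (provided the ordering is correct), and you can then filter on it: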

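SELECT
  e1,
  c1,
  c2
FROM (
       SELECT
         e1,
         c1,
         c2,
         -- c1 of the next row within the same e1
         LEAD(c1, 1)
         OVER (PARTITION BY e1
           ORDER BY e1 ASC, c1 ASC, c2 ASC) AS lead_c1
       FROM so_test  -- substitute your own table, e.g. foo
     ) AS with_lead
-- keep only rows that are not followed by an adjacent row
WHERE lead_c1 - c2 != 1 OR lead_c1 IS NULL
-- order the final result; an ORDER BY inside the subquery does not guarantee it
ORDER BY e1 ASC, c1 ASC, c2 ASC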

Output:

e1|c1|c2
1 |5 |7
1 |9 |15
2 |3 |4
2 |7 |8

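Without knowing the table structure, I had to order by all columns to make sure the rows come in the same order as you posted them. If you have another key (such as a sort key), it is better to use that.

If my assumption that e1 must be the same is wrong, remove "PARTITION BY e1".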
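
Note that the filter above only drops rows that are followed by an adjacent row, so it returns 1 | 5 | 7 rather than the merged 1 | 1 | 7 from the expected output. One possible way to actually merge the runs is the "gaps and islands" pattern: flag each row that does not continue the previous one using LAG, turn the flags into a group number with a running SUM, then take MIN(c1) and MAX(c2) per group. The following is only a sketch under stated assumptions, not a verified Redshift query: it runs the SQL against an in-memory SQLite database from Python (SQLite 3.25 or newer is assumed, for window function support), it treats "adjacent" as a gap of exactly 1 as the queries above do, and the names flagged, grouped, new_group, grp and merge_ranges are made up for this example.

```python
import sqlite3

# Sample rows from the question's table foo: (e1, c1, c2).
ROWS = [(1, 1, 2), (1, 3, 4), (1, 5, 7), (1, 9, 15), (2, 3, 4), (2, 7, 8)]

# Gaps-and-islands sketch; CTE and column names are hypothetical.
MERGE_SQL = """
WITH flagged AS (
    SELECT e1, c1, c2,
           -- 0 when this row starts exactly 1 after the previous row's c2,
           -- 1 otherwise (including the first row, where LAG is NULL)
           CASE WHEN c1 - LAG(c2) OVER (PARTITION BY e1 ORDER BY c1, c2) = 1
                THEN 0 ELSE 1 END AS new_group
    FROM foo
),
grouped AS (
    SELECT e1, c1, c2,
           -- running count of group starts = group number within e1
           SUM(new_group) OVER (
               PARTITION BY e1 ORDER BY c1, c2
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS grp
    FROM flagged
)
SELECT e1, MIN(c1) AS c1, MAX(c2) AS c2
FROM grouped
GROUP BY e1, grp
ORDER BY e1, c1
"""


def merge_ranges(rows):
    """Load rows into an in-memory table foo and return the merged ranges."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE foo (e1 INTEGER, c1 INTEGER, c2 INTEGER)")
        conn.executemany("INSERT INTO foo VALUES (?, ?, ?)", rows)
        return conn.execute(MERGE_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in merge_ranges(ROWS):
        print(row)
```

On the sample rows this returns (1, 1, 7), (1, 9, 15), (2, 3, 4) and (2, 7, 8), which matches the expected output in the question. The SQL in MERGE_SQL uses only LAG, a framed SUM window and GROUP BY, so it should carry over to Redshift, but that has not been checked here.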