Get data clubbed from Redshift
I am trying to organize data in a certain way. Here is what I am trying to do.
I have a table in Redshift from which we are trying to get the following output.
Table: foo
e1 | c1 | c2
1 | 1 | 2
1 | 3 | 4
1 | 5 | 7
1 | 9 | 15
2 | 3 | 4
2 | 7 | 8
We are trying to club all the rows where the difference between the previous row's c2 and the next row's c1 is less than 1.
Desired output
e1 | c1 | c2
1 | 1 | 7
1 | 9 | 15
2 | 3 | 4
2 | 7 | 8
Current output
e1 | c1 | c2
1 | 1 | 4
1 | 3 | 7
2 | 3 | 4
2 | 7 | 8
I tried using a CTE. Here is the query I am working on. The results I get are isolated, or
CTE:
with es as (
    select *
    from foo
    where e1 not in (
        SELECT t1.e1
        FROM foo as t1
        inner join foo as t2
            on t1.e1 = t2.e1 and (t2.c1 - t1.c2) = 1
    )
    union all
    SELECT t1.e1
         , t1.c1
         , isnull(t2.c2, t1.c2) as c2
    FROM foo as t1
    inner join foo as t2
        on t1.e1 = t2.e1 and (t2.c1 - t1.c2) = 1
)
select * from es
where e1 is not null
Can someone help me with this?
I guess you mean "We are trying to club all the rows where difference between previous row c2 next row c1 is less than 1 where e1 is the same".
You can use window functions for that. LEAD gives you the following row's c1 (provided the ordering is correct), and you can then filter on it:
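SELECT
    e1,
    c1,
    c2
FROM (
    SELECT
        e1,
        c1,
        c2,
        -- c1 of the next row within the same e1
        LEAD(c1, 1)
            OVER (PARTITION BY e1
                  ORDER BY e1 ASC, c1 ASC, c2 ASC) AS lead_c1
    -- so_test: test table, presumably holding the sample data from the question's table foo
    FROM so_test
    ORDER BY e1 ASC, c1 ASC, c2 ASC) AS with_lead
-- keep rows that are not directly followed by an adjacent row
WHERE lead_c1 - c2 != 1 OR lead_c1 IS NULL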
Output:
e1|c1|c2
1 |5 |7
1 |9 |15
2 |3 |4
2 |7 |8
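Without knowing the table structure, I had to order by all columns to make sure the rows come in the same order as you posted them. If you have another key (such as a sort key), it would be better to use that.
If my assumption about e1 being the same is wrong, remove "PARTITION BY e1".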
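Note that the LEAD filter keeps only the last row of each run of adjacent rows, so it does not by itself give the 1 | 1 | 7 row from the desired output. If the goal is to collapse each run into a single row (smallest c1, largest c2), one possible approach is the usual "gaps and islands" pattern. This is only a sketch under assumptions: the table is named foo as in the question, "adjacent" means next c1 minus previous c2 equals 1 (as in the sample data and the join condition in the question), and ordering by c1, c2 within each e1 reflects the intended row order. The CTE names flagged and grouped and the columns is_new_group and grp are made up for illustration.

```sql
-- Sketch: merge runs of adjacent rows per e1 (assumes table foo from the question).
WITH flagged AS (
    SELECT
        e1,
        c1,
        c2,
        -- 0 when this row continues the previous row's range, 1 when it starts a new run.
        -- For the first row of each e1, LAG returns NULL, so the ELSE branch applies.
        CASE
            WHEN c1 - LAG(c2) OVER (PARTITION BY e1 ORDER BY c1, c2) = 1 THEN 0
            ELSE 1
        END AS is_new_group
    FROM foo
),
grouped AS (
    SELECT
        e1,
        c1,
        c2,
        -- Running count of run starts: every row of the same run gets the same number.
        SUM(is_new_group) OVER (
            PARTITION BY e1
            ORDER BY c1, c2
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS grp
    FROM flagged
)
SELECT
    e1,
    MIN(c1) AS c1,
    MAX(c2) AS c2
FROM grouped
GROUP BY e1, grp
ORDER BY e1, MIN(c1);
```

On the sample data from the question this should return 1 | 1 | 7, 1 | 9 | 15, 2 | 3 | 4 and 2 | 7 | 8. If the real rule is "less than or equal to 1" rather than "equal to 1", the comparison inside the CASE would need to change accordingly.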